Implications of Evidential Decision Theory

Postby Brian Tomasik on 2013-02-13T04:55:00

Paul Almond is one of the smartest people I know, and I need to get in the habit of reading more of his essays. One that I came across tonight was "On Causation and Correlation - Part 2: Implications of Evidential Decision Theory." I've only skimmed it for now and would like to read it properly some time later, but I thought I'd write quick notes about it in case that "some time later" never arrives.

I was interested to learn that Paul subscribes to evidential decision theory (EDT) in much the same way I do. In the past, I had been a regular causal decision theory (CDT) adherent, as most people are, but upon learning about EDT, I became more uncertain. [BTW, I wrote those linked Wikipedia articles on EDT and CDT in 2009. If you can believe it, there were no Wikipedia articles for these terms before then.]

In most of the standard thought experiments, I side with the EDT'ers. Even for the smoker's lesion problem, where the "correct" answer is usually that you should smoke because you can't change your genes, I'm uncertain what I would do.
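To make the disagreement concrete, here's a minimal sketch of the smoker's-lesion calculation under CDT versus EDT; all of the numbers are made up purely for illustration and come from nowhere in particular:

```python
# Smoker's lesion, sketched with hypothetical numbers.
# A hidden lesion both causes cancer and makes you more likely to choose to smoke.
P_LESION = 0.1                    # base rate of the lesion
P_LESION_GIVEN_SMOKE = 0.3        # choosing to smoke is evidence of the lesion
P_LESION_GIVEN_ABSTAIN = 0.05
P_CANCER_GIVEN_LESION = 0.9
P_CANCER_GIVEN_NO_LESION = 0.01
U_SMOKING, U_CANCER = 10, -1000   # smoking is mildly pleasant; cancer is very bad

def expected_utility(p_lesion, smokes):
    p_cancer = p_lesion * P_CANCER_GIVEN_LESION + (1 - p_lesion) * P_CANCER_GIVEN_NO_LESION
    return (U_SMOKING if smokes else 0) + p_cancer * U_CANCER

# CDT: your choice can't change your genes, so use the unconditional lesion probability.
cdt = {"smoke": expected_utility(P_LESION, True), "abstain": expected_utility(P_LESION, False)}

# EDT: treat the action itself as evidence about the lesion.
edt = {"smoke": expected_utility(P_LESION_GIVEN_SMOKE, True),
       "abstain": expected_utility(P_LESION_GIVEN_ABSTAIN, False)}

print("CDT picks", max(cdt, key=cdt.get))   # smoke
print("EDT picks", max(edt, key=edt.get))   # abstain
```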

I don't know if I understand timeless decision theory (TDT) properly, but if I do, then I think TDT is basically EDT applied only to your cognitive algorithms, not to everything. That is, if you choose X, it gives you evidence that the general cognitive algorithm you're running tends to output X. But it doesn't extend to non-cognitive parts of the universe. I would ask, "Why not? Your genes are another type of algorithm. Why not use EDT for them too?"

Anyway, I should read more of what Paul has to say on this later.

As for the article itself, it explains some really important ideas, a few of which I and friends have discussed informally but haven't written down systematically. There are many interesting points, but one that I wanted to highlight is the following. It suggests why, even for our own sake, we should ensure that sims are treated nicely and that religious fundamentalists don't take control of massive computational resources:
One way in which evidential decision theory would be relevant is in the way it allows you to control the probability that you are in a simulation in the first place. If your civilization decides to develop the capability to run simulated realities, then you are meta-causing civilizations in general to do likewise (including civilizations on which our own might be modeled), and making it less likely that almost all civilizations end before they are capable of producing simulated realities, in turn making it more likely that you are in a simulated reality. If, however, your civilization decides not to acquire this capability then you are meta-causing civilizations in general to do likewise, making it less likely that you are in a simulated reality. Once your civilization has the capability to produce simulated realities, if your civilization decides to do it, this would make it more likely that other civilizations also do it, again making it more likely that you are in a simulated reality. On the other hand, if your civilization decides not to produce simulated realities, this makes it less likely that other civilizations would choose to do so, and therefore less likely that you are in a simulated reality yourself. [...]

Evidential decision theory is not restricted to the issue of whether we are in a simulated reality. If we are in a simulated reality, it might be relevant in allowing us to control the probabilities that we are in various kinds of simulation. If we construct many simulated realities in which various things happen, then if another civilization is simulating us, we might be meta-causing it to make those things happen to us. This creates an argument for being kind to the inhabitants of any simulated realities that you do make.

Postby DanielLC on 2013-02-13T07:00:00

That is, if you choose X, it gives you evidence that the general cognitive algorithm you're running tends to output X.


That's not how I understand it.

You ignore all evidence and work only from your priors and the knowledge that in that situation you'd make whatever decision you make. You don't involve the fact that you actually are in that situation. For example, in Parfit's hitchhiker, you'd never be able to decide whether or not to pay the guy unless he picked you up, but you don't take this into account. You simply decide that, a priori, a universe where you would pay him is better than one where you would not.

Postby Brian Tomasik on 2013-02-17T00:44:00

DanielLC wrote:You simply decide that, a priori, a universe where you would pay him is better than one where you would not.

A universe where you pay him in this situation is better. Hence, you're saying you'd like to find that you decide to pay him. If you do resolve in your own mind that you want to pay him, that's evidence that you're the kind of person who would pay in situations like these, so he's more likely to believe you and accept the deal. Do we disagree?

Postby DanielLC on 2013-02-18T07:03:00

If you do resolve in your own mind that you want to pay him, that's evidence that you're the kind of person who would pay in situations like these, so he's more likely to believe you and accept the deal.


That's not the decision the paradox is about. The paradox is about the decision of whether or not to pay him once he has already rescued you. Once this happens, no choice you make can provide evidence that he rescued you, since it's certain he has no matter what you decide to do.

Postby Brian Tomasik on 2013-02-19T08:48:00

What I meant was that you want to discover that you have the kind of cognitive algorithm that chooses to adopt the policy of paying him iff he rescues you. In other words, you want to discover that you're someone who keeps promises. If you are, then he'll save you. So you'd like to try your best to make yourself commit to promise-keeping in whatever form that takes.

Postby DanielLC on 2013-02-20T07:31:00

So you'd like to try your best to make yourself commit to promise-keeping in whatever form that takes.


You'd do that in any decision theory. It's not an interesting problem. The interesting problem is when your promise comes due.

Also, there are versions where you don't get a chance to make a promise. Perhaps you were already passed out when he picked you up, but he knew you well enough to know you'd pay.

Postby Brian Tomasik on 2013-02-20T09:06:00

DanielLC wrote:You'd do that in any decision theory. It's not an interesting problem. The interesting problem is when your promise comes due.

Interesting. I guess what I meant was that even if you have no physical pre-commitment mechanisms, you want to find that your choice would be to pay him if you were to imagine the situation happening. I think that's what you meant in your first reply.

Postby Arepo on 2013-02-20T12:37:00

To both of you: what is your objection to the non-DT response to the hitchhiker scenario that how well you can persuade the driver that you’ll give him money is not necessarily related to your actual likelihood of doing so?
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.
User avatar
Arepo
 
Posts: 1065
Joined: Sun Oct 05, 2008 10:49 am

Re: Implications of Evidential Decision Theory

Postby DanielLC on 2013-02-20T22:15:00

I guess what I meant was that even if you have no physical pre-commitment mechanisms, you want to find that your choice would be to pay him if you were to imagine the situation happening.


You would, and you'd be more likely to pay him because of it. If this is something that's going to happen on a regular basis, you'd pay him, knowing that if you don't you won't pay the next guy and he'll call your bluff. If it's not likely to come up again, then you're probably not going to pay him, assuming you're using EDT.

what is your objection to the non-DT response to the hitchhiker scenario that how well you can persuade the driver that you’ll give him money is not necessarily related to your actual likelihood of doing so?


If there's no correlation, it's a boring problem. We only discuss the more interesting variant in which there is a correlation.

Postby Brian Tomasik on 2013-02-20T22:40:00

Arepo wrote:To both of you: what is your objection to the non-DT response to the hitchhiker scenario that how well you can persuade the driver that you’ll give him money is not necessarily related to your actual likelihood of doing so?

OLD REPLY: I think that's actually a DT-dependent response. To a causal decision theorist, once you were already rescued, then you would have no incentive to pay (ignoring reputation effects, repeated plays of the game, etc.). An evidential/timeless decision theorist would, I think, say that you want to find that your algorithm decides to pay even after it has been rescued, because this is the algorithm that lets you win.

NEW REPLY: Sorry, I missed the "not" in your "not necessarily" clause. Ok, in that case I agree with DanielLC that it's an uninteresting problem. (It's also not true that your persuasion abilities are not necessarily related to your choice when the driver has your source code and can simulate what you would do in the situation.)

Postby Brian Tomasik on 2013-02-20T22:53:00

DanielLC wrote:If it's not likely to come up again, then you're probably not going to pay him, assuming you're using EDT.

OLD REPLY:

I don't see this. I think EDT would say that when I inspect my own thoughts about whether to pay him later, I'd like to find that I would pay him later, because discovering this makes it more likely I would in fact pay him later.

Recall that the literature conventionally says that EDT one-boxes on Newcomb, so I think the standard interpretation of EDT would win on Parfit's hitchhiker too.

NEW REPLY:

Hmm, now I see where DanielLC was coming from. Newcomb is potentially different from hitchhiker because in Newcomb, there's no time delay. When you choose to one-box, you get $1 million. In hitchhiker, it's conceivable that your choice could change between when you promise to pay and once you're actually rescued. As DanielLC said, once you're rescued, that's all the evidence you need that you were rescued, so you don't also need to find that you would pay out.

So maybe we need a cognitive-algorithm-focused spin on EDT sometimes, and that's what TDT is, AFAICT.

P.S.: I'm updating my replies a lot in part because I'm still confused about decision theory in general. This is a good learning exercise for me.
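For reference, here's a minimal sketch of the standard claim above that EDT one-boxes on Newcomb, with the usual $1M/$1K payoffs and a hypothetical 99%-accurate predictor (the accuracy figure is made up for illustration):

```python
# Newcomb sketch: the opaque box holds $1,000,000 iff the predictor foresaw one-boxing;
# the transparent box always holds $1,000. Predictor accuracy is a made-up figure.
ACCURACY = 0.99
MILLION, THOUSAND = 1_000_000, 1_000

# EDT conditions the box's contents on your own choice.
edt_one_box = ACCURACY * MILLION                    # predictor probably foresaw one-boxing
edt_two_box = (1 - ACCURACY) * MILLION + THOUSAND   # predictor probably foresaw two-boxing

print(edt_one_box, edt_two_box)  # 990000.0 vs 11000.0, so EDT one-boxes
# CDT would note that the contents are already fixed and two-box,
# since two-boxing dominates for any fixed contents.
```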

Postby Arepo on 2013-02-21T14:41:00

Daniel & Alan wrote:If there's no correlation, it's a boring problem. We only discuss the more interesting variant in which there is a correlation.

...

NEW REPLY: Sorry, I missed the "not" in your "not necessarily" clause. Ok, in that case I agree with DanielLC that it's an uninteresting problem. (It's also not true that your persuasion abilities are not necessarily related to your choice when the driver has your source code and can simulate what you would do in the situation.)


I wholeheartedly agree it’s uninteresting in this case. Our dispute is whether it becomes interesting elsewhere.

OLD REPLY: I think that's actually a DT-dependent response. To a causal decision theorist, once you were already rescued, then you would have no incentive to pay (ignoring reputation effects, repeated plays of the game, etc.). An evidential/timeless decision theorist would, I think, say that you want to find that your algorithm decides to pay even after it has been rescued, because this is the algorithm that lets you win.


If you specify the situation such that intent and appearance don’t correlate, then me paying is inconsistent with the premise that I’m a rational self-utility maximiser (RSUM). Obviously you can call the not-paying consequence of me being an RSUM ‘[some particular DT]’ if you like, but I don’t see what you stand to gain.

If you specify the situation such that intent and appearance correspond, then me not intending to pay is inconsistent with the premise that I’m an RSUM.

If you specify a middle ground, such that I have positive expectation of the correlation between intent and appearance, then I run a utility calculation. If [the number of years I expect to live on escaping the desert] * [my utility per year] * [the difference in probability I think sincerely vowing would make] is greater than the utility I expect the money he's asking for would buy me conditional on my survival, then I vow sincerely, else I don't (there's a rough numerical sketch of this at the end of this post).

But of the above three situations, only the first (and perhaps a very weak version of the second, such that I have an easy choice not to vow) seems at all plausible in this universe. The others remind me of Hare's 'how to argue with an anti-utilitarian'. They are fantastical examples disguised in real-world clothing, inadmissible as appeals to intuition, because they don't contain anywhere near enough detail.

Firstly, how do we know – or have any idea of - the driver’s ability to perceive my intent? As in Newcomb’s paradox, the basic problem assumes we have just been given some sourceless knowledge which we’re expected to take at face value.

Secondly, why – other than a long tradition of abusing the term – are we so sure 'intent' refers to a genuine (apparently emergent) phenomenon? It seems to me near perfectly reducible to the feelings evoked when we contemplate certain situations, and where it's imperfectly so, it's only because of its vagueness; people occasionally slip in nuances when they use it.

If we so clarify ‘intent’, the emptiness of the ‘problem’ becomes even more apparent. Perhaps I can change the chemistry of my brain by willpower alone such that future-me becomes more likely to cooperate with present-me, and if so not doing so is inconsistent with the premise that I’m an RSUM.

If I can’t then clearly I don’t and admittedly now ‘I’ die because ‘I’ was ‘too rational’. But this isn’t biting any bullet – this is just looking through the range of logically conceivable outcomes and noting that I can’t guarantee winning the game. If I don’t have the requisite self-modifying ability, I lose. So what? I see no paradox in this, nor the need to consider anything other than expected value calculations to reach this point, and nothing that importing any further concepts from ‘decision theory’ would do to improve my chances of winning.

(Incidentally the deflationary view of personal identity - which I’d more aggressively refer to as its deflationary nature, since any alternative seems ludicrous - makes this entire discussion empty on its original premises. There was no essential ‘me’ for a perfectly logical creature to want to preserve – the tension between present-me and future-me just showcases that – so the scenario is incoherent.

I might be able to ironman the setup by restating it such that I’m a utilitarian and need to survive for the greater good, but know that money given to the driver will be wasted, in which case I think the scenario just about makes logical sense. But as above, I don’t see a paradox or problem, per se. I play the game with the best strategies available to me given the rules, and if I lose, so be it.)
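To put rough numbers on the middle-ground calculation above (all of them made up, purely to instantiate the inequality):

```python
# The middle-ground inequality with hypothetical numbers.
YEARS_IF_RESCUED = 50                # years of life expected on escaping the desert
UTILITY_PER_YEAR = 100
PROB_BOOST_FROM_SINCERE_VOW = 0.4    # how much a sincere vow raises the chance of rescue
UTILITY_OF_KEPT_MONEY = 50           # what the money would buy, conditional on survival

gain_from_vowing = YEARS_IF_RESCUED * UTILITY_PER_YEAR * PROB_BOOST_FROM_SINCERE_VOW
print("vow sincerely" if gain_from_vowing > UTILITY_OF_KEPT_MONEY else "don't vow")
# With these numbers the survival term (2000) dwarfs the money term (50), so you vow.
```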
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.
User avatar
Arepo
 
Posts: 1065
Joined: Sun Oct 05, 2008 10:49 am

Re: Implications of Evidential Decision Theory

Postby DanielLC on 2013-02-22T07:04:00

Our dispute is whether it becomes interesting elsewhere.


When would it be interesting?

Obviously you can call the not-paying consequence of me being an RSUM ‘[some particular DT]’ if you like, but I don’t see what you stand to gain.


As is the case with many uninteresting problems. If you see a winning lottery ticket on the ground, should you pick it up? If you're standing next to a cliff, should you jump off?

If you specify a middle ground, such that I have positive expectation of the correlation between intent and appearance, then I run a utility calculation. If [the number of years of l expect to live on escaping the desert] * [my utility per year] * [the difference in probability I think sincerely vowing would make] is greater than the utility I expect the money he’s asking for would buy me conditional on my survival, then I vow sincerely, else I don’t.


Your choice of whether or not the vow is sincere, i.e. whether or not you fulfill it, is made after you've made the vow and you were picked up. How do you justify a nonzero difference in probability of survival after you know you've been picked up?

Firstly, how do we know – or have any idea of - the driver’s ability to perceive my intent?


You could tell by his track record. You could also just be a bad liar. In this case, you could think of it as past!you trying to tell if future!you will keep the promise. Past!you knows you very, very well, and is likely to make a good prediction.

Secondly, why – other than a long tradition of abusing the term – are we so sure ‘intent’ refers to a genuine (apparently emergent) phenomenon?


We know that either you will fulfill your promise or you will break it. We presumably know that the other guy can predict whether or not it will be broken at a rate significantly above chance.

this is just looking through the range of logically conceivable outcomes and noting that I can’t guarantee winning the game.


The strategy of paying the guy yields a significantly higher success rate than the strategy of not paying him. Is this not logically conceivable? Do you feel that because you can't be certain that following your strategy will lead to a worse outcome, it doesn't matter that it usually does?

There was no essential ‘me’ for a perfectly logical creature to want to preserve


There was still someone. If you value people in general, you'd want him not to die. Also, he's quite a lot like you, and he will likely bring about what you value.

Postby Arepo on 2013-02-22T14:13:00

DanielLC wrote:When would it be interesting?


It would be interesting to me (in this context) iff some aspect of decision theory (that did not conceptually predate the naming of the decision theory under which it fell) could improve the analysis.

Your choice of whether or not the vow is sincere, i.e. whether or not you fulfill it, is made after you've made the vow and you were picked up. How do you justify a nonzero difference in probability of survival after you know you've been picked up?


Not entirely. If I make the vow imagining that I'll keep it, something different has happened than if I make the vow imagining that I'll break it. But if not, and if we're still claiming that the driver's behaviour is linked to whether future-me keeps it despite no difference in present-me, then we're in Omega territory – a world so alien that anything goes and nothing from it matters in this one.

You could tell by his track record. You could also just be a bad liar.


There are many ways in which you *could*, but if the thought experiment is supposed to change our intuition, it's worthless unless we demand high precision of it. Not an ad hoc set of possible justifications for the wild claims, but an actual scenario up front, described with all the salient details, where demonstrating an unrealistic feature would render the whole thing irrelevant to our intuitions.

Secondly, why – other than a long tradition of abusing the term – are we so sure ‘intent’ refers to a genuine (apparently emergent) phenomenon?


We know that either you will fulfill your promise or you will break it. We presumably know that the other guy can predict whether or not it will be broken at a rate significantly above chance.


What does this have to do with my criticism of ‘intent’? Whatever we refer to by the word, I think we can agree that ‘I intend x’ isn’t logically equivalent to ‘X will happen’.

The strategy of paying the guy yields a significantly higher success rate than the strategy of not paying him. Is this not logically conceivable? Do you feel that because you can't be certain that following your strategy will lead to a worse outcome, it doesn't matter that it usually does?


What on earth makes you say it 'usually leads to a worse outcome'? My 'strategy' is basically defined as being the best approach: 'Maximise expected value'. It deals with all the possible outcomes and has higher expected value than any different strategy. In some scenarios it will mean paying, in most it won't, and I don't need a 'decision theory' to cover each (or really any) of those possibilities.

If I ever lose, it’s only because I’ve taken a maximal-expectation gamble and it hasn’t paid off. So what? I’d do the same again any time I had the chance.

There was still someone. If you value people in general, you'd want him to not die. Also, he's quite a lot like you, and he will likely bring what you value.


If I’m a rational pure self-interest-maximiser, the only thing I value is me, so since he won’t be me a) he won’t value the same thing and b) it wouldn’t make a difference if he did, because now-me would still be gone. As I said, I think I can ironman out this criticism, so let’s not dwell on it.
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.
User avatar
Arepo
 
Posts: 1065
Joined: Sun Oct 05, 2008 10:49 am

Re: Implications of Evidential Decision Theory

Postby DanielLC on 2013-02-22T23:32:00

Past!you is capable of predicting what future!you will do fairly well. There's not a whole lot future!you can do to hide your decision from present!you. Past!you could lie about your decision, but he will likely be caught in his lie. He could intentionally fail at predicting future!you's actions, but he will know it isn't an accurate prediction, and if he claims it is, he will likely be caught in his lie. Future!you's decision is entangled with past!you's prediction. Past!you's prediction is entangled with Paul Ekman's decision of whether or not to pick you up. As such, future!you's decision is entangled with Paul Ekman's decision.

What does this have to do with my criticism of ‘intent’? Whatever we refer to by the word, I think we can agree that ‘I intend x’ isn’t logically equivalent to ‘X will happen’.


No, but I think the thought experiment works better if you use "x will happen" instead of "I intend x to happen". If you want something closer to the normal meaning, perhaps "I predict that I will do x". That's still not how "intent" is normally defined. I'd say the original thought experiment is badly worded.

What on earth makes you say it ‘usually leads to a worse outcome’?

Those who attempt your strategy are worse off than those who do not. Of those who actually get the chance to make the decision, those who attempt your strategy do better, but it's less likely for someone who attempts your strategy to make it to that point.

It occurs to me that this thread is named after EDT, but you're talking about an argument to distinguish EDT from TDT. I think I can argue better if we go with one to distinguish CDT from EDT, such as Newcomb's paradox.

Postby Arepo on 2013-02-23T11:38:00

DanielLC wrote:Past!you is capable of predicting[1] what future!you will do fairly well. There's not a whole lot future!you can do to hide your[2] decision from present!you. Past[3]!you could lie about your decision, but he will likely be caught in his lie. He could intentionally fail at predicting future!you's actions, but he will know it isn't an accurate prediction, and if he claims it is, he will likely be caught in his lie. Future!you's decision is entangled with past!you's prediction[1]. Past[3.1]!you's prediction is entangled with Paul Ekman's decision of whether or not to pick you up. As such, future!you's decision is entangled with Paul Ekman's decision.


I think I agree with much, maybe all, of this in spirit until the last line, but at least one of us is misexplaining some details. The bits I've numbered in brackets are those where it doesn't hold for me, unless you replace them with something like the following:

[1] 'demonstrates/is evidence for' (via his track record of behaviour plus the fact that he shares so much personality makeup with future- and present-me)
[2] Just a semantic quibble I think, but for clarity, 'his'
[3] Surely 'Present'? Especially in [3.1], we're not given any indication in the original thought experiment that he has any idea who we are. If you want to rewrite it so he does, I'm happy to consider that, but can we stick to one intuition-appeal at a time?

And the last line just doesn't seem to follow from anything. If A increases the likelihood of B it doesn't mean

That's still not how "intent" is normally defined. I'd say the original thought experiment is badly worded.


Can you rephrase it so that we're sure we're not arguing at cross purposes? My underlying claim is something like 'for any fully described variation on Parfit's hitchhiker, Newcomb's paradox or indeed any other such thought experiment, there is no utilitarian gain from thinking about it in terms of multiple possible "decision theories" rather than expected utility', so to disprove the claim we need a well-described variant and a demonstration of how expected utility is inadequate.

Those who attempt your strategy are worse off than those who do not. Of those who actually get the chance to make the decision, those who attempt your strategy do better, but it's less likely for someone who attempts your strategy to make it to that point.


You keep asserting this, but I see no reason at all to believe it, or rather, no way of even parsing it. Can you clarify what you mean by a) 'my strategy' and b) any alternative, and then explain to me how the alternative does better? I think 'my strategy' is - by definition - so general as to contain any possible best solutions. As such it is practically useless (or just not a strategy per se), and yet you seem to be saying there's a broader category into which it fits, which contains some other information without which I'm actually excluding plausible best solutions.

So many smart people with similar worldviews to mine make this claim that even though it seems conceptually incoherent I think I'm emotionally open to being persuaded of it, but it seems to entail answering some very basic questions that said smart people seem collectively unable to answer or even parse.

It occurs to me that this thread is named after EDT, but you're talking about an argument to distinguish EDT from TDT. I think I can argue better if we go with one to distinguish CDT from EDT, such as Newcomb's paradox.


I would prefer the focus to remain on DTs in general. I just don't see any way in which they aren't spurious metainformation.
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.
User avatar
Arepo
 
Posts: 1065
Joined: Sun Oct 05, 2008 10:49 am

Re: Implications of Evidential Decision Theory

Postby DanielLC on 2013-02-23T19:00:00

Surely 'Present'?


I don't like picking an arbitrary moment in the experiment to call the present, so I'm referring to the point further in the past as past, and the point further in the future as future.

Can you rephrase it so that we're sure we're not arguing at cross purposes?


That's probably something you should be doing yourself, but sure.

Past!you is in the desert. Paul Ekman offers to give past!you a ride for $500. Past!you does not have money, but Paul says that he's willing to let future!you pay, so long as past!you thinks he will. However, he can tell if past!you is lying. He asks past!you if he has predicted future!you's actions to the best of his ability, and, if so, whether future!you will pay.

Past!you has access to your source code, for obvious reasons. Not even Past!you can run a perfect simulation of future!you, but he cannot be intentionally fooled. Given that you would make a certain decision under a certain circumstance, he has a 90% chance of correctly predicting it. If future!you would have an epiphany that changes his strategy of making decisions, past!you's simulation of future!you is 90% likely to have that epiphany and make that decision.

You keep asserting this, but I see no reason at all to believe it, or rather, no way of even parsing it.


People who would not pay die 90% of the time and save $500 the other 10%. People who would pay do better.
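As a minimal sketch of that comparison (the 90% figure and the $500 come from the setup above; the value placed on surviving is made up):

```python
# Expected outcomes for the two kinds of agent under a 90%-accurate prediction.
ACCURACY = 0.9
VALUE_OF_SURVIVING = 1_000_000   # hypothetical utility of not dying in the desert
COST_OF_PAYING = 500

# Agents who would pay get rescued whenever the prediction is right.
ev_would_pay = ACCURACY * (VALUE_OF_SURVIVING - COST_OF_PAYING)

# Agents who would not pay get rescued only when the prediction is wrong.
ev_would_not_pay = (1 - ACCURACY) * VALUE_OF_SURVIVING

print(ev_would_pay, ev_would_not_pay)   # 899550.0 vs 100000.0
```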

Postby Arepo on 2013-02-26T22:58:00

DanielLC wrote:That's probably something you should be doing yourself, but sure.


If there were a version of the thought experiment I could imagine that would show me to be wrong, then because it's a thought experiment, I'd already know that I was wrong.

Past!you is in the desert. Paul Ekman offers to give past!you a ride for $500. Past!you does not have money, but Paul says that he's willing to let future!you pay, so long as past!you thinks he will. However, he can tell if past!you is lying. He asks past!you if he has predicted future!you's actions to the best of his ability, and, if so, whether future!you will pay.

Past!you has access to your source code, for obvious reasons. Not even Past!you can run a perfect simulation of future!you, but he cannot be intentionally fooled. Given that you would make a certain decision under a certain circumstance, he has a 90% chance of correctly predicting it. If future!you would have an epiphany that changes his strategy of making decisions, past!you's simulation of future!you is 90% likely to have that epiphany and make that decision.


Ok. Now tell me why your view ever wins over mine. I take it I am not allowed to assert that I would change future me's source code now, so that he believed me? If not, why would I have been able to do so in the past?
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.
User avatar
Arepo
 
Posts: 1065
Joined: Sun Oct 05, 2008 10:49 am

Re: Implications of Evidential Decision Theory

Postby DanielLC on 2013-02-27T05:47:00

Now tell me why your view ever wins over mine.


It's not actually my view. I lean more towards EDT. It is tempting though.

TDT wins because the TDT agent (usually) lives and the CDT and EDT agents (usually) die. What's the point of being rational if it just gets you killed?

I take it I am not allowed to assert that I would change future me's source code now, so that he believed me?


You can't precommit, if that's what you mean.

If not, why would I have been able to do so in the past?


I don't understand. When were you allowed to change your source code?

You can change your strategy. You can make whatever code you want future!you to follow and then start following it. You just can't force future!you to actually follow it. If you decide to be a TDT agent, you can be a TDT agent. However, if you want to be CDT and yet have future!you be TDT agent, then future!you will likely decide not to follow this strategy, and be a CDT agent and try to convince further future!you to change to TDT. To put it another way, one can't expect future!you to follow through on present!you's promises if present!you isn't willing to follow through on past!you's promises.

Also, you can modify the problem so that you simply don't have a chance to precommit. It does make it hard to make it realistic though (unless you accept the doomsday argument). Perhaps Omega finds you already passed out in the desert, scans your brain, and uses that to decide whether or not you'd pay.

Postby Arepo on 2013-03-03T12:02:00

DanielLC wrote:It's not actually my view. I lean more towards EDT. It is tempting though.

TDT wins because the TDT agent (usually) lives and the CDT and EDT agents (usually) die. What's the point of being rational if it just gets you killed?


This is a description of what winning means, not an explanation of why TDT will do it where the normal value-maximizing principle won't.


I don't understand. When were you allowed to change your source code?

You can change your strategy. You can make whatever code you want future!you to follow and then start following it. You just can't force future!you to actually follow it. If you decide to be a TDT agent, you can be a TDT agent.


How is 'decide to be a TDT agent' not a subset of 'change my source code'? Presumably I wasn't a TDT agent before.
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.
User avatar
Arepo
 
Posts: 1065
Joined: Sun Oct 05, 2008 10:49 am

Re: Implications of Evidential Decision Theory

Postby DanielLC on 2013-03-04T06:48:00

This is a description of what winning means, not an explanation of why TDT will do it where the normal value-maximizing principle won't.


A TDT agent is willing to pay, and will be rescued. A CDT or EDT agent is not, and will not.

How is 'decide to be a TDT agent' not a subset of 'change my source code'?


You are your source code. Deciding to be a TDT agent and having the source code of a TDT agent are the same thing. If you decide to be a TDT agent, there is no need to change your source code, because it already makes you a TDT agent.

Postby Arepo on 2013-03-06T17:08:00

A TDT agent is willing to pay, and will be rescued. A CDT or EDT agent is not, and will not.


Why? What algorithm is he following? You can’t just assert that it’s one that’s functionally indistinguishable from others in almost any situation but that magically works here and expect me - or Paul Eckerman - to heed you.

You are your source code. Deciding to be a TDT agent and having the source code of a TDT agent are the same thing. If you decide to be a TDT agent, there is no need to change your source code, because it already makes you a TDT agent.


This is sophistry. Assuming I am a TDT agent, between birth and point A I was something other than a TDT agent. Then I became a TDT agent. If I am my source code, that involved changing my source code. If that was allegedly possible then, why is it not allegedly possible now that Paul Eckerman is standing in front of me? You seem to just be defining this as suddenly having become impossible. I see no reason to accept your assertion that giving myself a 'reboot with TDT software' (whatever that is) instruction is impossible now but was possible two weeks ago.
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.
User avatar
Arepo
 
Posts: 1065
Joined: Sun Oct 05, 2008 10:49 am

Re: Implications of Evidential Decision Theory

Postby DanielLC on 2013-03-07T02:44:00

Why?


Because he's following an algorithm that results in paying.

What algorithm is he following?


Take the action such that, given that you would take this action in this situation, but not given anything else like the fact that you already saw Paul Eckerman pick you up, maximizes expected utility.

If that was allegedly possible then, why is it not allegedly possible now that Paul Eckerman is standing in front of me?


Like all humans, your source code is constant, but you don't have full control over it.

Postby Arepo on 2013-03-07T12:25:00

Take the action such that, given that you would take this action in this situation, but not given anything else like the fact that you already saw Paul Eckerman pick you up, maximizes expected utility.


If your algorithm refers to anything about the situation at hand, then it’s clearly not applicable to decision-making in general. Maybe all I’m ultimately asking for is a description of TDT, but I want someone to show me specifically why it achieves something that ‘maximise expected utility’ doesn’t, and furthermore, something which is inconsistent with that maxim.

A thought I just had that might be behind our disagreement:

Would you claim that ‘maximise expected utility’ is actually two proposals rolled into one – one defining your overall goal in life G, one giving you an algorithm A to follow from moment to moment?
If so, you might be claiming that TDT is consistent with (and a specific strategy for achieving) G but inconsistent with, and sometimes superior to, A.

If so I think I would still probably disagree on both points – that G is not equivalent to A and that TDT is sometimes superior to A, but since the second disagreement follows from the first we could disregard the second and possibly gain greater focus.

Like all humans, your source code is constant, but you don't have full control over it.


This isn’t an answer. Why do I not have enough control to change it now, if I did N time ago?
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.
User avatar
Arepo
 
Posts: 1065
Joined: Sun Oct 05, 2008 10:49 am

Re: Implications of Evidential Decision Theory

Postby DanielLC on 2013-03-07T23:58:00

If your algorithm refers to anything about the situation at hand, then it’s clearly not applicable to decision-making in general.


I meant any given situation.

That being said, even if all I gave was a decision theory that referred to that situation in particular, it would suggest that it's part of a larger pattern.

I want someone to show me specifically why it achieves something that ‘maximise expected utility’ doesn’t


"Maximize expected utility" is not well-defined. I could tell you how it does better than CDT or EDT, but that's it.

EDT tells you not to pay Paul because the probability of survival given that he rescued you and you pay him is no higher than the probability of survival given that he rescued you and you do not. Because an EDT agent would make this decision, he dies.

TDT tells you to pay Paul, because you're more likely to be alive given only that you pay Paul. Because a TDT agent would make this decision, he lives.
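A rough sketch of that difference in conditioning (not a full implementation of either theory; the 90% accuracy and the value of surviving are made-up illustration figures):

```python
# Hitchhiker: what each theory conditions on when deciding whether to pay.
ACCURACY = 0.9                   # hypothetical accuracy of the prediction
VALUE_OF_SURVIVING = 1_000_000   # hypothetical utility of surviving
COST_OF_PAYING = 500

# EDT at the decision point conditions on already having been rescued,
# so survival is certain either way and paying only loses $500.
edt_pay, edt_refuse = VALUE_OF_SURVIVING - COST_OF_PAYING, VALUE_OF_SURVIVING
print("EDT:", "pay" if edt_pay > edt_refuse else "refuse")   # refuse

# The TDT-style evaluation conditions only on which policy you run,
# not on the fact that you were already picked up.
tdt_pay = ACCURACY * (VALUE_OF_SURVIVING - COST_OF_PAYING)
tdt_refuse = (1 - ACCURACY) * VALUE_OF_SURVIVING
print("TDT:", "pay" if tdt_pay > tdt_refuse else "refuse")   # pay
```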

Why do I not have enough control to change it now, if I did N time ago?


You can decide to follow any decision theory. You cannot simply decide for your future self to fulfill promises you make now. This is not just an assumption the problem has. This is a fact about real life. If you want to know why, ask a psychologist.