User:IssaRice/AI safety/Possibility of act-based agents



"If a pre­dic­tor can pre­dict a sys­tem con­tain­ing con­se­quen­tial­ists (e.g. a hu­man in a room), then it is us­ing some kind of con­se­quen­tial­ist ma­chin­ery in­ter­nally to make these pre­dic­tions. For ex­am­ple, it might be mod­el­ling the hu­man as an ap­prox­i­mately ra­tio­nal con­se­quen­tial­ist agent. This pre­sents some prob­lems. If the pre­dic­tor simu­lates con­se­quen­tial­ist agents in enough de­tail, then these agents might try to break out of the sys­tem." [1]

"I think we have a foundational disagreement here about to what extent saying 'Oh, the AI will just predict that by modeling humans' solves all these issues versus sweeping the same unsolved issues under the rug into whatever is supposed to be modeling the humans." [2]

"For ex­am­ple cur­rently I find it re­ally con­fus­ing to think about cor­rigible agents rel­a­tive to goal-di­rected agents." [3]