Phil 3
| Question | Answer |
|---|---|
| Hilton’s Argument: | It's somewhat likely that AI will cause an existential catastrophe |
| What kind of AI should we worry about? | AI that has goals and makes plans to achieve them; has strategic awareness; and has human-level (or better) capabilities when it comes to persuasion/manipulation, hacking, scientific research, and business/military strategy. |
| What is misalignment? | It’s when a goal-directed system aims at goals that aren’t good goals (neither relative to human interests nor objectively). This happens a lot when the system’s reward differs—even slightly—from what we really care about. |
| How it could go bad: | Human disempowerment and/or extinction |
| The Most Pressing World Problems | Risks from artificial intelligence; catastrophic pandemics; nuclear war; great power conflict |
| “Other Pressing World Problems” | Global health (deaths from, e.g., HIV and malaria); climate change; safeguarding liberal democracy; unfair and harmful immigration restrictions |
| Reward misspecification | An AI becomes great at achieving the thing it was rewarded for during training, but in a way that doesn’t get at what really matters. |
| Goal misalignment | Even though the AI is good at achieving the thing it’s rewarded for during training, that strategy fails when it’s moved to a new environment. |
| Instrumental Convergence | No matter what final goals we fix for the AI, any planning AI will also develop certain instrumental goals—intermediate goals that promote success in its final goals. |
| Instrumental goals | Self-preservation (“you can’t fetch the coffee if you’re dead”); preventing serious changes to the system’s current goals; gaining more resources and capabilities to help with achieving goals |
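| Expected Utility | A measure of whether a risky choice is a good or bad deal: weight the value of each possible outcome by its probability and add up the results. |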
| Effective Altruism | Altruism: It’s very morally important to help others who are suffering or in danger. Effectiveness: We should choose the actions that most effectively help others. |
| Longtermism | Longtermist EAs are redirecting money from people who are definitely dying right now (e.g. the Against Malaria Foundation) just to reduce the possibility of a bad thing happening in the future (e.g. AI apocalypse). |
| Explain how Peter Singer's drowning-child example has been used to support effective altruism | If a child is drowning in a locked room and you have to pay to enter, you should still save the child, even if that money is all you have for the vending machine that day. By analogy, we should give up small comforts to help people in serious need. |
| Why are effective altruists impartial? | Because we should target those who need help most. You shouldn’t favor helping people just because they’re nearby, or your friends, or from your nation, racial in-group, etc. |
| How does Srinivasan explain why charitable work should be impartial? | By saying we're triaging like ER doctors. |
| Why do effective altruists treat AI alignment as a more pressing matter than global poverty? | Because AI could possibly wipe out the human race if we're not careful. |
| How can AI harm the environment? | Data centers require a lot of energy, which makes climate change worse. |
| Harm principle | Sinnott-Armstrong says: A joyride in a gas-guzzling car (like a single use of GenAI) does not harm anyone. |
| The general action principle | Sinnott-Armstrong’s Response: It’d be worse (in fact, disastrous) if everyone chose not to study medicine. But that doesn’t mean everyone is individually obligated to study medicine. |