========================
How to get s2ram working
========================

2006 Linus Torvalds
2006 Pavel Machek

1) Check suspend.sf.net; the s2ram program there has a long whitelist of
   "known ok" machines, along with tricks to use on each one.

2) If that does not help, try reading tricks.txt and
   video.txt. Perhaps the problem is as simple as a broken module, and
   a simple module unload can fix it.

3) You can use Linus' TRACE_RESUME infrastructure, described below.

Using TRACE_RESUME
~~~~~~~~~~~~~~~~~~

I've been working at making the machines I have able to STR, and almost
always it's a driver that is buggy. Thank God for the suspend/resume
debugging - the thing that Chuck tried to disable. That's often the _only_
way to debug these things, and it's actually pretty powerful (but
time-consuming - having to insert TRACE_RESUME() markers into the device
driver that doesn't resume and recompile and reboot).

Anyway, the way to debug this for people who are interested (have a
machine that doesn't boot) is:

 - enable PM_DEBUG, and PM_TRACE

 - use a script like this::

        #!/bin/sh
        sync
        echo 1 > /sys/power/pm_trace
        echo mem > /sys/power/state

   to suspend

 - if it doesn't come back up (which is usually the problem), reboot by
   holding the power button down, and look at the dmesg output for things
   like::

        Magic number: 4:156:725
                hash matches drivers/base/power/resume.c:28
                hash matches device 0000:01:00.0

   which means that the last trace event was just before trying to resume
   device 0000:01:00.0. Then figure out what driver is controlling that
   device (lspci and /sys/devices/pci* is your friend), and see if you can
   fix it, disable it, or trace into its resume function.

   If no device matches the hash (or any matches appear to be false positives),
   the culprit may be a device from a loadable kernel module that is not loaded
   until after the hash is checked. You can check the hash against the current
   devices again after more modules are loaded using sysfs::

        cat /sys/power/pm_trace_dev_match

For example, the above happens to be the VGA device on my EVO, which I
used to run with "radeonfb" (it's an ATI Radeon mobility).
It turns out
that "radeonfb" simply cannot resume that device - it tries to set the
PLL's, and it just _hangs_. Using the regular VGA console and letting X
resume it instead works fine.

NOTE
====
pm_trace uses the system's Real Time Clock (RTC) to save the magic number.
The reason is that the RTC is the only reliably available piece of
hardware during resume operations where a value can be set that will
survive a reboot.

pm_trace is not compatible with asynchronous suspend, so it turns
asynchronous suspend off (which may work around timing or
ordering-sensitive bugs).

The consequence of using the RTC is that after a resume (even if it is
successful) your system clock will have a value corresponding to the magic
number instead of the correct date/time! It is therefore advisable to use a
program like ntpdate or rdate to reset the correct date/time from an
external time source when using this trace option.

As the clock keeps ticking, it is also essential that the reboot is done
quickly after the resume failure. The trace option does not use the seconds
or the low order bits of the minutes of the RTC, but too long a delay will
corrupt the magic value.
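
The dmesg check described under "Using TRACE_RESUME" can be sketched as a
pair of small shell helpers. This is only an illustration, not part of
pm_trace itself: the function names pm_trace_devices and pci_driver_of, and
the SYSFS_PCI override variable, are made up for this example. It also
assumes the reported device is a PCI device whose name appears under
/sys/bus/pci/devices; devices on other buses need a different sysfs path.

```shell
#!/bin/sh
# Hypothetical helper: print the device names found in pm_trace
# "hash matches device ..." lines. Reads kernel log text on stdin.
pm_trace_devices() {
    sed -n 's/^.*hash matches device \([^ ]*\).*$/\1/p'
}

# Hypothetical helper: print the driver bound to a PCI device by following
# the "driver" symlink in sysfs. SYSFS_PCI is an assumed override that lets
# the function be tried against a fake directory tree.
pci_driver_of() {
    link="${SYSFS_PCI:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "(no driver bound)"
    fi
}
```

After the forced reboot, something like
``dmesg | pm_trace_devices | while read -r dev; do echo "$dev: $(pci_driver_of "$dev")"; done``
would list each matching device next to its driver. Treat a match as a hint
only, since hash matches may be false positives.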