  Semi-Random Reboots 
 
Thiago Prado Oct 05, 2015, 02:16pm EDT

Edited: Oct 05, 2015, 04:45pm EDT

Replies: 6 - Views: 1750
Hi everyone.

I am having an annoying reboot problem, and would love any help you can provide.

The symptoms: When playing demanding games, especially Evolve and Insurgency, the computer reboots. There is no blue screen (I have disabled Windows' automatic restart on blue screens); it simply shuts down during the game, stays off for 1 to 2 seconds, and automatically powers on again (which is why I am calling this a reboot). Curiously, I can sometimes play without reboots for weeks. When the problem does come back, it sometimes reboots every time I reach a specific game screen and keeps rebooting on that screen for a while, until it "decides" to reboot at a different moment or stops rebooting for a while longer. Less demanding games, such as Left 4 Dead 2, do not cause the problem, and I can play them for hours without any issues.

The hardware:
1. Motherboard: Asus Z87-Pro
2. Processor: Intel Core i7-4770K @ 3.5 GHz
3. Processor cooler: Thermaltake water cooler
4. RAM: 16 GB (2x8 GB Red Mushkin PC3-19200 10-12-12-28 1.65 V)
5. GPU: Gigabyte NVIDIA GTX 970
6. HDD: 2 TB Seagate (ST2000DM001-9YN1 SCSI Disk Device)
7. Optical drive: 1x LG DVD-RW
8. PSU: Mod X Stream-Pro 700W with 80 Plus
9. A bunch of case fans all over


Operating System: Windows 7 Home Premium (also tried with Windows 10 with the same results)

Tests I made:

Okay, so here starts the craziness. I can't find a reason for the reboots, and I would like to avoid replacing parts until I have found the problem.

I followed the order suggested in this post (http://www.hardwareanalysis.com/content/topic/57410/) as closely as possible.

Test 1: Viruses and spyware

I used a Boot & Nuke DVD to completely erase my hard drive, then re-installed Windows 7 from scratch. One of the first things I install is Norton Security, for which I have a valid subscription that I keep updated. I also ran all the Norton scans and found nothing. The error continues.

Test 2: Drivers

I have tried every scenario I can think of here: with the drivers that came on the PC's CDs, with everything fully updated, with Windows 7, and with Windows 10. All gave the same results.

Test 3: Overheating

3.1 GPU: Well, for starters, this is the only piece I changed in my PC trying to solve the issue. I had a GTX 780, now I have a GTX 970. Still have the same problem.

In addition, I have used Gigabyte OC_Guru to log the temperatures to a text file right up to the reboot. No excessive temperatures, in my opinion. Here are the last lines saved before the reboot:

Date        Time      GPU Clock  Mem Clock  GPU Voltage  Power  GPU Temp  Fan Speed
                      (MHz)      (MHz)      (V)          (%)    (°C)      (RPM)
09-17-2015  19:54:04  1316       7012       1.218        57     49        3189
09-17-2015  19:54:07  1316       7012       1.218        30     45        3197
09-17-2015  19:54:10  772        1620       0.887        14     43        3215
09-17-2015  19:54:13  658        1620       0.850        12     42        3226
09-17-2015  19:54:16  405        648        0.850        10     41        3217
09-17-2015  19:54:19  405        648        0.856        11     41        3214
09-17-2015  19:54:22  405        648        0.856        11     40        3216
09-17-2015  19:54:25  405        648        0.856        10     39        3213
09-17-2015  19:54:28  405        648        0.856        11     39        3213
09-17-2015  19:54:31  1316       7012       1.218        59     46        3162
09-17-2015  19:54:34  1316       7012       1.218        60     47        3180
09-17-2015  19:54:37  1316       7012       1.218        29     44        3184
09-17-2015  19:54:40  1316       7012       1.218        60     48        3169
09-17-2015  19:54:43  1316       7012       1.218        29     44        3190
09-17-2015  19:54:46  1316       7012       1.218        29     43        3195
09-17-2015  19:54:49  1316       7012       1.218        29     43        3198
09-17-2015  19:54:52  1316       7012       1.218        28     42        3195
09-17-2015  19:54:55  1316       7012       1.218        28     42        3186
09-17-2015  19:54:58  1316       7012       1.218        28     42        3200
09-17-2015  19:55:01  1316       7012       1.218        28     42        3197
09-17-2015  19:55:04  671        1620       0.856        15     40        3209
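
A log like this can also be scanned by script rather than by eye. The following is a minimal sketch, assuming the whitespace-separated column order shown in the log's header line; the function name `parse_gpu_log` and the 80 °C alert threshold are hypothetical choices for illustration, not values taken from OC_Guru or from the card's specification.

```python
from datetime import datetime

# A few rows in the raw format the tool writes (header plus data lines).
SAMPLE_LOG = """\
Date Time GPU Clock(MHz) Memory Clock(MHz) GPU Voltage(V) Power(%) GPU Temperature('C) FAN Speed(RPM)
09-17-2015 19:54:04 1316 7012 1.218 57 49 3189
09-17-2015 19:54:34 1316 7012 1.218 60 47 3180
09-17-2015 19:54:40 1316 7012 1.218 60 48 3169
09-17-2015 19:55:04 671 1620 0.856 15 40 3209
"""

ALERT_TEMP_C = 80  # hypothetical threshold, chosen only for this example


def parse_gpu_log(text):
    """Turn data rows into dictionaries; skip the header and blank lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 8:
            continue  # header line or empty line
        try:
            rows.append({
                "time": datetime.strptime(f"{parts[0]} {parts[1]}",
                                          "%m-%d-%Y %H:%M:%S"),
                "gpu_mhz": int(parts[2]),
                "mem_mhz": int(parts[3]),
                "volts": float(parts[4]),
                "power_pct": int(parts[5]),
                "temp_c": int(parts[6]),
                "fan_rpm": int(parts[7]),
            })
        except ValueError:
            continue  # eight fields, but not numeric data
    return rows


rows = parse_gpu_log(SAMPLE_LOG)
hottest = max(rows, key=lambda r: r["temp_c"])
last = rows[-1]

print(f"Hottest sample: {hottest['temp_c']} C at {hottest['time']:%H:%M:%S}")
print(f"Last sample before reboot: {last['temp_c']} C, {last['power_pct']}% power")
print("Over threshold" if hottest["temp_c"] >= ALERT_TEMP_C else "Below threshold")
```

On the rows above this only confirms what is visible by eye: the card never gets close to a worrying temperature before the reboot.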



3.2 CPU:

I ran several stress tests, and the temperature never goes much above 50 °C. For example, here is a 15-minute test with the CPU, FPU and cache stressed to 100%:

http://picpaste.com/cpuHeatTest-JAHRhtW9.png

As the results show, all temperatures are quite low, which suggests the water cooler is working correctly and can handle 100% load without any problems.

3.3 Other fans: Checked all the fans. They are all working.


Test 4: RAM

I used the bootable version of MemTest86 and let it run 4 passes of all its tests. After 10 hours, 20 minutes and 23 seconds of testing, no errors were detected at all, so I ruled out memory issues.

Test 5: PSU

Okay, so this is trickier to test. I have:
5.1: Checked the wattage I need here (http://outervision.com/power-supply-calculator), and I have way more than enough. It recommended 491 W. I have a 700 W PSU with 80 Plus, so in a worst-case scenario I should have 560 W available.
5.2: Disconnected the only thing I don't need, my DVD-RW drive, and still got the issue.
5.3: Monitored voltage variation over time. I barely have any variation (tested with the free version of HWMonitor); when I do, it is less than 4% (12 V, 3.3 V, 5 V). The only line that varies all the time is the 3.3 V line, but it goes from 3.392 to 3.408 and back, so it seems well within acceptable ranges to me.
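
The arithmetic behind 5.1 and 5.3 can be written out as a short sketch. It assumes the ±5% tolerance that the ATX specification allows on the +12 V, +5 V and +3.3 V rails; the function names are made up for illustration. Note that the 80 Plus rating describes conversion efficiency at the wall, not a derating of the rated output, so the 560 W figure is best read as a pessimistic lower bound rather than a specification.

```python
RECOMMENDED_W = 491               # figure returned by the calculator in 5.1
RATED_W = 700                     # PSU label rating
PESSIMISTIC_W = RATED_W * 0.80    # the post's worst-case assumption

ATX_TOLERANCE_PCT = 5.0           # assumption: +/-5% on +12 V, +5 V, +3.3 V


def headroom_pct(available_w, needed_w):
    """Spare capacity as a percentage of the power that is needed."""
    return (available_w - needed_w) / needed_w * 100


def deviation_pct(measured_v, nominal_v):
    """Signed deviation of a reading from its nominal rail voltage."""
    return (measured_v - nominal_v) / nominal_v * 100


print(f"Headroom at rated output:   {headroom_pct(RATED_W, RECOMMENDED_W):.1f}%")
print(f"Headroom at pessimistic W:  {headroom_pct(PESSIMISTIC_W, RECOMMENDED_W):.1f}%")

# The two readings reported for the 3.3 V line in 5.3.
for reading in (3.392, 3.408):
    dev = deviation_pct(reading, 3.3)
    status = "within" if abs(dev) <= ATX_TOLERANCE_PCT else "outside"
    print(f"{reading:.3f} V: {dev:+.2f}% ({status} tolerance)")
```

Both readings land inside the assumed tolerance band, which matches the "less than 4%" observation; it says nothing about what the rail does between samples or under a sudden load step.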

Theories

Well, that's the thing. I have a bunch of theories, none well supported, so I would like some help here to avoid changing piece by piece.

Can anyone see what the problem is with the above information? Or, can anyone indicate what other tests I can make without changing piece by piece, to find the problem?

Here are my theories, all actually less than good guesses:

Theory 1: Maybe the 3.3 V line is allowed to vary by 5%, but shouldn't show this 0.48% variation all the time, indicating that the PSU is the problem?
Theory 2: Motherboard is faulty and needs replacement (although I found no capacitors looking bad)?
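
For reference, the 0.48% in Theory 1 is the peak-to-peak swing of the two readings relative to the nominal 3.3 V. The sketch below reproduces it. The 16 mV step size is purely an assumption (the thread does not say which sensor chip is used); if the sensor really resolves this rail in steps of that size, the swing would be a single step, meaning the reading is just flickering between two adjacent values.

```python
NOMINAL_V = 3.3
LOW_V, HIGH_V = 3.392, 3.408       # the two readings reported for the 3.3 V line

swing_v = HIGH_V - LOW_V
swing_pct = swing_v / NOMINAL_V * 100

ASSUMED_STEP_V = 0.016             # hypothetical sensor resolution, not from the thread
steps = round(swing_v / ASSUMED_STEP_V)

print(f"Peak-to-peak swing: {swing_v * 1000:.0f} mV ({swing_pct:.2f}% of nominal)")
print(f"Equivalent sensor steps (assuming {ASSUMED_STEP_V * 1000:.0f} mV each): {steps}")
```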

I don't know what else to do. I went to two physical "computer expert" stores, and both "experts" gave up, saying they didn't know what the problem was and that the only way forward would be replacing part by part. They can't do that, since all my parts are expensive and they are not sure they would be able to resell them if they swap one and find it's not the problem. Meaning, I have to take all the risk myself and buy and swap part by part, which is obviously a very expensive and annoying idea. Although, at this point, I am considering it.


john albrich Oct 06, 2015, 08:29pm EDT

Edited: Oct 06, 2015, 08:31pm EDT

 
>> Re: Semi-Random Reboots
.
You might also consider monitoring and logging the independent CPU, GPU, and RAM voltages and seeing how those change during operation. While these are derived from the PSU, the motherboard does have its own on-board voltage regulators. The on-board regulators could be defective outright, or they could be negatively affected by your observed PSU output variations. Programs like Open Hardware Monitor, SpeedFan, and so on may help you obtain more details in your testing. Here are a couple of links...there are other similar tools. Note that what works well on one motherboard may not work as well on a different brand. Some also provide a real-time logging ability while others do not offer data logging at all.
http://www.majorgeeks.com/files/details/open_hardware_monitor.html (very nice open-development project)
http://www.majorgeeks.com/files/details/speedfan.html

There are some challenges here as well. The CPU and GPU voltages and/or clock frequencies may be actively controlled and varied depending on sub-system load/demand. It MAY be possible to turn off this "eco-friendly" software function on some devices (like a graphics card). I've run into situations where just about every time someone tries to provide an "eco-friendly" power-management capability they screw it up in a MAJOR way. The notorious ones that come to mind are the histories of hard drive power-management screw-ups that cause errors and/or massive performance degradation. I would expect CPU/GPU "eco-friendly" features to have the same challenges.

In addition, the hardware/software monitoring sub-systems are not very granular, and they operate at low sample rates. Thus, extremely fast variations (aka glitches, spikes) may not be "seen" by a voltage monitoring program, and obtaining meaningful voltage data precisely at the time your system actually "re-boots" can be difficult. Most of these programs use a "sampling" method with an interval that may be measured in seconds, so that leaves a LOT of real-world gaps for missing real-time data. This sampling issue also applies to the software system-based PSU voltage monitors you've already reported. Just because the values reported SHOWED as "OK" during your entire test doesn't guarantee they actually were "OK" for the entire test.
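
The sampling gap can be illustrated with a toy simulation. This is only a sketch: the rail model, the 5 ms sag to 10.5 V and its timing are invented numbers, not measurements from any real PSU. The point is simply that a once-per-second reader can step right over an event that a much faster instrument would catch.

```python
def rail_voltage(t):
    """Hypothetical +12 V rail: steady, except for one 5 ms sag to 10.5 V."""
    dip_start, dip_len = 12.3405, 0.005   # seconds; invented for this example
    return 10.5 if dip_start <= t < dip_start + dip_len else 12.0


def min_seen(period_s, duration_s=30.0):
    """Lowest voltage observed when sampling once every period_s seconds."""
    n = int(round(duration_s / period_s))
    return min(rail_voltage(i * period_s) for i in range(n + 1))


slow = min_seen(1.0)     # one reading per second, like a typical monitoring tool
fast = min_seen(0.001)   # one reading per millisecond, closer to a scope or logger

print(f"Minimum seen at 1 Hz:  {slow:.1f} V")
print(f"Minimum seen at 1 kHz: {fast:.1f} V")
```

With these made-up numbers the 1 Hz reader reports a flat 12.0 V for the whole run, while the 1 kHz reader sees the sag.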

In addition, many of the reported values aren't even accurate to 2 decimal places let alone the 4 or 5 decimals a monitoring program may report. People succumb to the fallacy that a "digital value" is always the truth and always exact.

Thiago Prado Oct 07, 2015, 07:42pm EDT

Edited: Oct 07, 2015, 08:19pm EDT

 
>> Re: Semi-Random Reboots
Hi John, thanks for your answer. It gave me a few more ideas to try.

It will take a while until I can reproduce the problem in multiple scenarios while using one of the suggested tools (for example, with the power-saving features turned off), but I will post more information as soon as I have it. Maybe with your help we can unravel this mystery! :)

Thanks!

kOrny Oct 10, 2015, 02:28am EDT
>> Re: Semi-Random Reboots
It still sounds like overheating to me. I see what you have done already to troubleshoot.

Have you monitored the temperature of the CPU while playing the demanding games? Does your BIOS have a setting to protect the CPU if it reaches a certain temp?

Thiago Prado Oct 10, 2015, 03:25am EDT

Edited: Oct 10, 2015, 04:03pm EDT

 
>> Re: Semi-Random Reboots
Hi kOrny, thanks for your reply.

Yes, I have monitored all temperatures up to the point the PC reboots using Open Hardware Monitor, as per John's suggestion. At the last reboot I monitored, the CPU was at 52 °C, the motherboard at 32 °C, and the GPU at 43 °C. All seem well within normal operating temperatures, so assuming the readings are correct, I don't think it is a temperature problem.

I will post last monitored values soon, I just want to test with several configurations first.

And yes, my BIOS has several failsafes, including CPU overheat protection. Even though I know it is unwise, I have even tried disabling the temperature protections in my BIOS as a test, and it still rebooted.

One thing I noticed with another monitoring tool is that there is a VIN4 voltage that oscillates like crazy (between 0.016 V and 1.088 V from one second to the next, all the time). I didn't find much about what that is; some people suggest it's the CPU voltage. Maybe that indicates a problem with the PSU or the processor itself?

Anyway, tomorrow a new psu arrives and I will make more tests and post the results. I am hoping it is the psu, although so far I have no clue to what it could really be.

Thanks for your reply, and please let me know if you have any more ideas.

Thiago Prado Oct 10, 2015, 04:02pm EDT
>> Re: Semi-Random Reboots
Hi John.

As per your suggestion, I have logged all data with Open Hardware Monitor. Excellent project indeed. Sadly, SpeedFan does not work with my hardware, and brings up a blue screen whenever I open it.

Regarding the eco-friendly stuff, my BIOS has 3 pre-set options: Power Saving, Normal, and Asus Optimal. In theory, Asus Optimal sets everything to high performance with no power-saving features. I was using Normal; now I am using Asus Optimal, with the same problem.

I agree, software-acquired data is unreliable, but it is the only tool I have besides swapping each part until I find the problem (which I already did with the video card, and a new PSU should arrive Monday).

Until then, here is the last data recorded by Open Hardware Monitor before a game reboot, which this time occurred seconds after booting and opening the game. I didn't see anything strange there, but maybe someone can spot something wrong that I missed?


Time: 10/10/2015 16:32:47
Fan Control #1: 100%
Fan Control #2: 100%
Fan Control #3: 100%
Fan Control #4: 100%
Fan Control #5: 100%
Fan Control #6: 100%
CPU VCore: 0,872000039
Voltage #2: 1,016
AVCC: 3,3920002
3VCC: 3,37600017
Voltage #5: 1,016
Voltage #6: 1,9920001
Voltage #7: 0,064
3VSB: 3,44
VBAT: 3,36000013
VTT: 1,008
Voltage #12: 0,816000044
Voltage #13: 0,336000025
Voltage #14: 0,536
Voltage #15: 0,464000016
{Note: all temperatures are in degrees Celsius}
CPU Core: 31
Temperature #1: 30,5
Temperature #2: 28
Temperature #3: 55
Temperature #5: 36
Temperature #6: 55
Fan #1: 2020
Fan #2: 1486
Fan #3: 1082
Fan #4: 1091
Fan #5: 1137
Fan #6: 2109
{Note: I am assuming this is percentage of use per core}
CPU Core #1: 24,7956753
CPU Core #2: 31,4102535
CPU Core #3: 21,53846
CPU Core #4: 24,6153831
CPU Total: 25,5899429
{Note: Temperature, again? Using another measure perhaps? Not sure...}
CPU Core #1: 32
CPU Core #2: 38
CPU Core #3: 36
CPU Core #4: 36
CPU Package: 38
CPU Core #1: 3797,817
CPU Core #2: 3697,87427
CPU Core #3: 3697,87427
CPU Core #4: 3697,87427
CPU Package: 34,3216438
CPU Cores: 23,89456
CPU Graphics: 0 {note: not using cpu graphics, only the video card}
CPU DRAM: 7,64183855
Bus Speed: 99,94255
Memory: 31,5205917
Used Memory: 5,02428436
Available Memory: 10,9154053
GPU Core: 40
GPU: 1513
GPU Core: 1315,854
GPU Memory: 3505,50024
GPU Shader: 2631,708
GPU Core: 3
GPU Memory Controller: 0
GPU Video Engine: 0
GPU Fan: 43
GPU Memory: 10,345459
Used Space: 16,9594574
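
A dump in this shape is awkward to compare across runs by hand, so here is a rough sketch of how it could be loaded into a script. It assumes the "Name: value" layout shown above, decimal commas, and bracketed notes that should be ignored; the names `LINE_RE` and `parse_dump` are hypothetical. Because the dump reuses labels such as "CPU Core #1" for different kinds of sensor, the sketch keeps every value per label in order of appearance rather than overwriting earlier ones.

```python
import re
from collections import defaultdict

# A few lines copied from the dump above, including a note and a repeated label.
SAMPLE = """\
Time: 10/10/2015 16:32:47
CPU VCore: 0,872000039
AVCC: 3,3920002
{Note: all temperatures are in degrees Celsius}
CPU Core: 31
CPU Graphics: 0 {note: not using cpu graphics, only the video card}
CPU Core #1: 24,7956753
CPU Core #1: 32
"""

# "Name: number", with an optional decimal comma and an optional trailing {note}.
LINE_RE = re.compile(
    r"^(?P<name>[^{:][^:]*):\s*(?P<value>-?\d+(?:,\d+)?)\s*(?:\{.*\})?$"
)


def parse_dump(text):
    """Return (name, value) pairs in file order; non-numeric lines are skipped."""
    readings = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            value = float(match.group("value").replace(",", "."))
            readings.append((match.group("name").strip(), value))
    return readings


readings = parse_dump(SAMPLE)

# Group repeated labels so no reading silently replaces another.
by_name = defaultdict(list)
for name, value in readings:
    by_name[name].append(value)

for name, values in by_name.items():
    print(f"{name}: {values}")
```

The timestamp line and the standalone note are skipped because they do not match the numeric pattern; which unit each repeated label refers to still has to be worked out from its position in the dump.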


I don't understand all of this, but nothing seems out of the ordinary. Any ideas?

Thanks!

Thiago Prado Oct 15, 2015, 09:24pm EDT

Edited: Oct 15, 2015, 09:31pm EDT

 
>> Re: Semi-Random Reboots
Hi everyone.

Just for closure, I fixed the problem by changing the PSU, and I no longer have any problems. Sadly, it seems there weren't any tests that could detect that the PSU was the problem, and I was lucky that it was the second piece of hardware I changed blindly.

There was barely any voltage variation, all well within tolerance, it was not getting hot, and a friend of mine even got to test it in a little thingy that you plug the PSU into, and all the lights were green, indicating it was okay. Go figure...

FYI, I changed to an EVGA 600W PSU with 80 Plus Bronze, model 100-B1-0600-KR, +5V@20A, +3.3V@24A, +12V@49A and +5Vsb@3A, and it is doing the job nicely.

Thanks for all the suggestions, and as far as I am concerned, this one is closed. Cya!

