# Project "Headless Linux CLI Multiple GPU BOINC Server" - Ubuntu Server 12.04.4/14.04.1 64-bit - Using GPUs from GeForce GT610/GT640/GTX750ti/+ to crunch



## DarkRyder

Tex is out on the road right now; as soon as I hear from him I'll let him know that you might need some help with this.


----------



## DanHansenDK

Hello DarkRyder









Thank you, that's very kind of you








We are so happy to hear from you in here, because we really need some help on this matter. My God, I haven't done anything else these last 3-4 weeks than fight this issue.







We are really looking forward to hearing from you again.









Thanks again









Kind Regards,
Dan Hansen
Denmark


----------



## DarkRyder

np man. Hope you join our team if you aren't already. We'll help you as best we can.


----------



## bfromcolo

A few thoughts: I do have Ubuntu 12.04 (not server) installed, and I am able to boot and run SETI@home and BOINC remotely, with no attached peripherals, using SSH and PuTTY.

I initially set up the system with a monitor attached and did install the desktop; I wonder if that somehow changes what gets installed?

I am using the 331.20 driver. The newest driver dropped SETI@home output by as much as 40% and introduced other problems that are widely reported if you Google it.

There is a timing issue in the BOINC client that causes it to attempt to recognize the GPU before the driver has been loaded at boot. Restarting the client fixes this for me, but some people delay starting the client until the driver has initialised.
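The "delay until the driver is up" workaround can be sketched as a small boot-time wrapper. This is only a sketch: the init-script path and the retry limit are assumptions, so adjust them for your distribution's BOINC package.

```shell
#!/bin/sh
# Start BOINC only after the NVIDIA driver answers, to avoid the
# GPU-detection race described above.

# Retry a command every second until it succeeds or we give up.
wait_for() {
    cmd=$1
    tries=${2:-30}
    i=0
    until $cmd >/dev/null 2>&1; do
        i=$((i + 1))
        [ "$i" -ge "$tries" ] && return 1
        sleep 1
    done
    return 0
}

# Only meaningful on a machine that actually has the BOINC init script.
if [ -x /etc/init.d/boinc-client ]; then
    # nvidia-smi only answers once the kernel module has loaded.
    wait_for "nvidia-smi -L" 30 || echo "NVIDIA driver never came up" >&2
    /etc/init.d/boinc-client start
fi
```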

Good luck!


----------



## Tex1954

HI!

Welcome to the world of Linux!

I've spent years messing with this. Ubuntu as a whole doesn't much care about BOINC and BOINC-related issues; their primary concern seems to be pushing their client and their stupid shell... I've learned to hate Ubuntu.

After trying about every version of Linux out there, I discovered several that work well with BOINC. However, Linux morphs so often that one has to decide ahead of time what they want the Linux box for. After installing it and getting it running well, disable all updates unless you know for sure you need them.

If you are like me, you want it to run BOINC only. No LibreOffice or other useless apps, just a BOINC cruncher. Perhaps you also use VNC like I do to talk to headless systems and want the GUI for that purpose. If that is your situation, maybe I can help. I've found ONE setup that seems to work very well. But Ubuntu only ever worked well for me using 11.10 64-bit; all the later versions crapped out.

So far, I've never gotten AMD GPUs to run properly, although I know of some others using AMD GPUs with some minor success. But I've got Nvidia GPUs running fine.

On a side note, it seems some apps run better under Windows vs. Linux on CPU tasks, which I have to investigate as well. Still, the majority of my CPU farm will probably run Linux without GPUs.

There are several threads here that will tell you how to get BOINC running on various versions of Linux... my suggestion is toss Ubuntu out the door and go with LinuxMint Cinnamon.

http://www.overclock.net/t/1374845/6-months-of-pain-with-linux-now-i-got-it


----------



## DanHansenDK

Hi Tex,

Thanks for getting back to me









Well, it's a rack-mounted system: a lot of 2U-high computers built into a rack. The only way I control them is by SSH/PuTTY, from a Windows 7 rig and a few Ubuntu desktops/notebooks. But all control is done on the command line (CLI in your language?).







I made some scripts that monitor CPU/GPU/HDD temperatures; these are shell scripts run by cron. So it's all command line. Every single command I use to control BOINC is the command-line tool boinccmd.
I've been testing the system for 2 weeks now, using the desktop edition. And even though the GPU is being used, the numbers are still smaller than when I ran the server edition and only the CPU was doing the crunching. The numbers don't lie, so I have to find a solution. Many from berkeley.edu have ideas; some say "use Slackware", some say use Windows. Well, I contacted the Ubuntu community and let them know about all the problems many of us have had regarding this matter. I know my problems mean nothing to them. But maybe, when people from a different "flavour" solve the issue, they will wish they had joined in. Actually, I find the lack of interest very disappointing.

Well, I hope we can solve the problem. As I just wrote at hardwarecanucks, the berkeley.edu test system arrives the day after tomorrow, and then it's "all hands on deck".

This is the hardware from test 1:

Project Headless RACK Linux Boinc Servers
Ubuntu Server 12.04.4 64bit
Intel i5-3470/4Gb Ram/Asus P8H61-MX
MSI GeForceGT610 2Gb
-
Nvidia v.nvidia-linux-x86_64-331.38
BOINC v.7.2.33 x86_64-pc-linux-gnu

The next system will be:

Project Headless RACK Linux Boinc Servers
Ubuntu Server 12.04.4 64bit
Intel i5-3570K/8Gb Ram/Asus P8H77-M PRO
Asus GeForceGT640 2Gb PCIe 2.0 x16
Asus GeForceGT640 2Gb PCIe 3.0 x16

Then let's see what can be done about GPUs and crunching on a headless Linux server, even one from Ubuntu.

My answer may be a little "off". If it is, I'll have to admit that I'm a little tired. It's 04:41 am. I've been fighting the problem of mapping a WebDAV share in Windows 7. But I won that one.









Again, thanks for your reply








Kind Regards,
Dan Hansen
Denmark


----------



## DanHansenDK

Quote:


> bfromcolo
> A few thoughts, I do have Ubuntu 12.04 (not server) installed and I am able to boot and run [email protected] and BOINC remotely with no attached peripherals using SSH and Putty.
> 
> I initially set up the system with a monitor attached and did install the desktop, I wonder if that somehow changes what gets installed?


Hi "Bfromcolo",

Well, I have one Ubuntu Desktop running right now. The problems went away after the desktop edition of 12.04 was installed. But in the last 14 days, a test has revealed that the crunching is not as good as when the server edition was installed and only the CPU was doing the crunching.
I ordered a new system: a better CPU and a better GPU, the Asus GT640. Actually it's 2 GT640s, one PCIe 2.0 and one PCIe 3.0. I've found some sites covering the issue, and a guru from the Windows environment actually showed me one of the best guesses on how to solve it.
From berkeley.edu I learned that it's possible to make a headless cruncher using Slackware. But I don't want to choose another flavour now, just because of a little problem.
It's just that I found this post http://www.overclock.net/t/1123532/guide-gui-linux-for-boinc-how-i-do-it-done-for-now and thought they might have cracked the case.









Well, all my computers (not notebooks and desktops, of course) are rack-mounted, all in 2U cases which I imported from Germany. Here's a picture.
Industrial 2U cooler from Dynatron or JAC, like this:


These are my rack-mounted Linux headless BOINC crunchers. This is why I want it to be the Server Edition and all CLI:


This is the first test system, which I'm running the Desktop Edition on. I'll show you the new system in a couple of days. Better CPU, 2 GPUs, and it will run CLI and headless if it's the last thing I do.










The hardware I'm receiving tomorrow is:
Intel i5-3470K
Asus P8H77-M Pro
8Gb Kingston HyperX Genesis X2 Grey S.
Asus GT640-1GD5-L PCIe 2.0 x16
Asus GT640-1GD3-L PCIe 3.0 x16

Software & Drivers:
Ubuntu Server 12.04.4
Nvidia v.nvidia-linux-x86_64-331.38
LM-Sensors v.3.3.1 with libsensors version 3.3.1
BOINC v.7.2.33 x86_64-pc-linux-gnu

I'll get back when I have better news









Kind Regards,
Dan Hansen
Denmark


----------



## DanHansenDK

Hi Guys,

It's time! The hardware is here and I will be putting (NOT PUTIN!) it together now. There are some issues when joining industrial hardware and commercial hardware.
Here's one: the bracket for the 2U cooler doesn't fit on several Asus motherboards. Therefore I've been modifying the bracket so that it, well, fits.







Look at this:




Here's the stuff for Headless Linux Boinc Server v.2.0.0:



The hardware I'm receiving tomorrow is:
Intel i5-3470K
Asus P8H77-M Pro
8Gb Kingston HyperX Genesis X2 Grey S.
Asus GT640-1GD5-L PCIe 2.0 x16
Asus GT640-1GD3-L PCIe 3.0 x16

If anybody knows how to install the parts of X which the GPU/CUDA uses to crunch, please don't hesitate to tell me. As you know, the Desktop Edition of Ubuntu does the trick. The problem is that what we need is a headless cruncher where all control is done from the CLI.
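For what it's worth, NVIDIA's Linux getting-started guide documents a way to use CUDA with no X at all: load the kernel module and create the `/dev/nvidia*` device nodes by hand at boot. A hedged sketch of that idea follows; the guard conditions and the device counting are my own assumptions, so verify the details against the guide before relying on it.

```shell
#!/bin/sh
# Headless GPU access without X: load the nvidia module and create the
# /dev/nvidia* character devices (major number 195) by hand, e.g. from
# /etc/rc.local.

# Count NVIDIA display adapters in `lspci`-style output on stdin.
count_nvidia_gpus() {
    grep -iE 'vga|3d' | grep -ic nvidia
}

setup_nvidia_devices() {
    modprobe nvidia || return 1
    n=$(lspci | count_nvidia_gpus)
    i=0
    while [ "$i" -lt "$n" ]; do
        # One node per GPU: /dev/nvidia0, /dev/nvidia1, ...
        [ -e "/dev/nvidia$i" ] || mknod -m 666 "/dev/nvidia$i" c 195 "$i"
        i=$((i + 1))
    done
    # Plus the control device CUDA apps open first.
    [ -e /dev/nvidiactl ] || mknod -m 666 /dev/nvidiactl c 195 255
}

# Only attempt this as root on a machine that ships the nvidia module.
if [ "$(id -u)" = "0" ] && modinfo nvidia >/dev/null 2>&1; then
    setup_nvidia_devices
fi
```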


----------



## DanHansenDK

Hi,

OK, here we go. Just finished putting it together. The back row has been converted to Low Profile brackets so multiple graphic cards can be installed:




If anybody knows how to install the parts of X which the GPU/CUDA uses to crunch, please don't hesitate to tell me.

Kind regards








Dan


----------



## DarkRyder

looks sweet man.







nice job


----------



## DanHansenDK

Hello DarkRyder,

Thanks, thats nice of you to say









We still haven't cracked the case! I've tried several distros, in several different ways: OpenCL/CUDA, through Wine, but the only thing that works is using a GUI. I DON'T WANT A GUI! Not for a number-crunching rack-mounted rig.









I just made a post at howtoforge, where I fell on my knees to beg some of the Linux gurus to come to our aid. I've been trying for more than a month and a half, day in and day out.

I found a guy in the Netherlands, Mr. Gert-Jan, and he had some pretty exciting suggestions. And he did it using .deb files so that e.g. manual blacklisting of nouveau etc. wasn't necessary. But of course a new problem occurred. A warning was posted at the Nvidia CUDA toolkit download site: _"...The CUDA 5.5 Debian packages are not compatible with Ubuntu 12.04 after the 12.04.4 LTS update. Please use the .run installer instead..."_ And being the big f...... idiot that I am, the .run file needs too many manual steps!

I also tried to make an installation of Ubuntu Server 12.04.4 and then install the minimum required packages for X to get the GPU running. This, of course, didn't work either. Well, it works for about 10 minutes, it looks like: the GPU temperature rises and rises, and then after about 10 minutes it drops. I'm not sure it did any work at all.

Well, I guess it may end in a compromise: a GUI rig doing a CLI rig's job.

After making all these attempts to solve the issue, I'm so confused I can't remember exactly what I tried and what I didn't try. I've been doing about 430-450 setups, 5-6 different distros. I hope maybe someone will feel sorry for me and show me how it has to be done.









I've been working now for 23 hours straight. I need an hour of rest.







I'll be back...

But thank you for the kind words... It's nice to know someone keeps an interest in what you are doing, right?









Primary discussions:
https://devtalk.nvidia.com/default/topic/418202/cuda-programming-and-performance/cuda-working-on-ubuntu-desktop-not-on-ubuntu-server/1/
http://setiathome.berkeley.edu/forum_thread.php?id=73032&postid=1484254#1484254
http://www.howtoforge.com/forums/showthread.php?p=310905#post310905

Primary info sites:
http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html
http://setiathome.berkeley.edu/cuda.php
http://boinc.berkeley.edu/wiki/GPU_computing
https://help.ubuntu.com/community/Cuda

Kind Regards,
Dan


----------



## tictoc

I haven't gone through all of your issues/threads yet, but I will throw 12.04 LTS Server on one of my rigs and see what I can come up with. I have successfully used headless nodes for CUDA rendering, so I would think that BOINCing with a headless NVIDIA GPU should be possible.

Just to make sure that I understand what you would like to accomplish, can you confirm that the following is your goal?


Install BOINC on headless 12.04 LTS Server
Use NVIDIA 640 to crunch CUDA and/or OpenCL tasks
Control headless rig from Windows 7 using SSH/Putty

First off, I would recommend controlling BOINC with BOINCTasks from your Windows 7 rig, because BOINCTasks has a very powerful GUI that can do everything (and more) that you would want to do using boinccmd.
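For a remote manager like BOINCTasks to attach, the headless client needs two small files in its data directory: `gui_rpc_auth.cfg` (the RPC password) and `remote_hosts.cfg` (which hosts may connect). A minimal sketch, where the directory, password and address are placeholders; on Ubuntu's packaged client the data directory is typically `/var/lib/boinc-client`, and the client must be restarted afterwards.

```shell
#!/bin/sh
# Write the two files BOINC's GUI RPC (TCP port 31416) checks before it
# accepts a remote manager such as BOINCTasks. Values are examples.

setup_remote_rpc() {
    dir=$1       # BOINC data directory
    password=$2  # password the remote manager must present
    host=$3      # address or hostname allowed to connect
    printf '%s\n' "$password" > "$dir/gui_rpc_auth.cfg"
    printf '%s\n' "$host" > "$dir/remote_hosts.cfg"
}

# Example: allow a Windows 7 box on the LAN to attach.
# setup_remote_rpc /var/lib/boinc-client 'secret' 192.168.1.10
```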

I will work on this over the next few days, and see what I can come up with.

Regardless of the outcome, your dedication to getting this up and running is pretty impressive.


----------



## DanHansenDK

Hi Guys,

I did it! Ubuntu Server Edition, all CLI and no GUI, CPU & GPU running BOINC - or is it the other way around?! Here's the proof:



Ooops... Here it is a little larger











I will make a nice ToDo when I'm all done







There are a couple of issues left. I still need to check if it runs headless and with 2 or more GPUs. And then there's the shell script which checks the temperature of CPU, GPU & HDD. And I would like to have some fan control as well. But the biggest issue is solved.









I'll be back









Kind Regards,
Dan


----------



## DarkRyder

sweet man, gj.


----------



## DanHansenDK

Hi guys,

Sorry tictoc, I didn't notice your post until right now. Sorry!!! It was 5:30 am here in Denmark, and my eyes had a funny shape.









DarkRyder, thank you for those kind words









I'm very interested in hearing what you find, because I want to make the fastest possible cruncher. I've just ordered the 3rd system, for testing a VM in a CLI environment like Ubuntu Server. A guy from Berkeley had a great idea, so this will have to be tested too. But we did it!! We are now crunching without the X server running.

Next step: to check if it works with multiple GPUs and headless. And then I will install the Aerocool hardware-based fan control. Newer mobos like to control the fans for you, which is not very good when you have GPUs and a CPU running at 100%.
I've got a script where CPU/GPU/HDD temperatures are monitored, and if something gets hot you are warned with an email, at the same time as the warning from X-Vision, the fan controller from Aerocool. If something gets too hot (a limit chosen by you), the shell script, run by cron e.g. once a minute, will shut the server down. This way, if you should have a hardware defect, there are 2 watchdogs keeping an eye on the system. I'm working on a little piece of electronics too, which will include the fans in the script, so that you will be warned if a fan fails.
The 2U case I use has 4 heavy-duty fans, and the 5th and last fan which X-Vision controls will be the CPU fan, of course: an industrial 2U fan which keeps the CPU at a nice 49 degrees Celsius. Sensors will be placed around the case: on the CPU of course, the GPUs, RAM, and PSU!
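The watchdog idea described above can be sketched roughly like this, run from cron (e.g. `* * * * * /usr/local/bin/gpu-watchdog.sh`). The thresholds, the admin address and the script name are example values I made up, not values from this thread.

```shell
#!/bin/sh
# "GPU-WatchDog" sketch: warn by email when a GPU runs hot, shut down when
# it runs too hot. Thresholds and the admin address are example values.
WARN=70
LIMIT=85
ADMIN=admin@example.com

# Pull the highest temperature out of `nvidia-smi -a` output, skipping the
# "Gpu : N/A" lines that appear between the real readings.
max_gpu_temp() {
    awk '/Gpu/ && $NF == "C" { if ($(NF-1) > max) max = $(NF-1) } END { print max + 0 }'
}

# Only act on a machine that actually has the NVIDIA tools installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    t=$(nvidia-smi -a | max_gpu_temp)
    if [ "$t" -ge "$LIMIT" ]; then
        echo "GPU at ${t} C, shutting down" | mail -s "GPU-WatchDog: SHUTDOWN" "$ADMIN"
        shutdown -h now
    elif [ "$t" -ge "$WARN" ]; then
        echo "GPU at ${t} C" | mail -s "GPU-WatchDog: warning" "$ADMIN"
    fi
fi
```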

Let's see how it goes.

Thanks for the interest, you guys; it's not everywhere you get that kind of attention.









Kind Regards
Dan Hansen


----------



## DanHansenDK

Hi guys,

While doing these last things, I decided to order the 3rd test system. This is an Asus board with 4 x PCIe 3.0 x16. Actually it has got 5, but the fifth can't be used for this; there's no room. This means that the 3rd test will be a system with 4 GPUs: 4 Asus GeForce GT640, either PCIe 3.0 or 2.0. It doesn't matter in this case; it's not the bus speed I'm after, it's the GPU crunching time we like.








And I will still be using the Intel i5-3570K CPU, which has great performance compared to the price and performance of the i7-xxxx. Said another way, you don't gain that much by spending almost twice as much. In Denmark, one of the larger i7s costs about 500 US$. So should there be anyone out there visiting Denmark in the near future, please don't hesitate to fill your pockets with 6-core i7s for little me.









This mobo looks like it hasn't got the issues from test systems 1 and 2. On the flip side, there were several large disgusting component legs, with large solder points. Large as my a..







But with this mobo it looks like the bracket for the industrial 2U CPU fan is in the clear.













The system will be fitted with 4 of these Low Profile Graphic Cards:



Here's a test where we can see that in Windows environments the GT640 sucks! But when put to the test on a 64-bit Linux system, something happens! This is why I'm using this card, and because it's the best low-profile card. After all, I am trying to design a system which is not so pricey, but still does a very good job when running in a headless CLI environment.











Kind Regards
Dan Hansen


----------



## bfromcolo

Very interesting project you have going on, but you seem to have chosen relatively low end GPUs considering the performance (PPD) to cost ratio of higher end GPUs running in much less expensive systems. Are there particular projects you want to crunch where this architecture is beneficial?


----------



## DanHansenDK

Hi,

And running multiple GPUs on the headless CLI-based system (Ubuntu Server) also works now:

Test system 2:

```
# nvidia-smi -a | grep Gpu
Gpu : N/A
Gpu : 59 C
Gpu : N/A
Gpu : 44 C
```



Happy days


----------



## DarkRyder

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi,
> 
> And running multiple GPU's on the headless CLI based system (Ubuntu Server) also works now:
> 
> Test system 2:
> 
> # nvidia-smi -a |grep Gpu
> Gpu : N/A
> Gpu : 59 C
> Gpu : N/A
> Gpu : 44 C
> 
> 
> 
> Happy days


Awesome, man! Good job, keep us updated with all the good work!


----------



## DanHansenDK

Hi,
Quote:


> Very interesting project you have going on, but you seem to have chosen relatively low end GPUs considering the performance (PPD) to cost ratio of higher end GPUs running in much less expensive systems. Are there particular projects you want to crunch where this architecture is beneficial?


The reason I chose this GPU is simply that I can fit 4 of them in the 2U case. The card is a low-profile card with 2 low-profile brackets. I haven't been able to find a faster card which can endure that kind of heat at the same time.

*Jobs I'm aiming to do:*
I've got AVX in my sights.
The 64-bit part of SETI@home: v7 7.01 and AstroPulse v6 6.03
Enhanced SETI@home (can't remember what it's called now)

*Next test - looking for the right CPU:*
OK, now we are going to step things up, just a little bit







We are going to find the right CPU for crunching data for SETI@home and similar BOINC projects (those which offer GPU crunching too, of course). We are going up 1 level! Trying the app_info.xml to define exactly which applications we want to run. In this case we will try the new AVX for the Haswell CPU, we will do GPU crunching times 4, and we will do it in 64-bit! Why 64-bit?? Because, in Linux, when you give a 64-bit job to the Nvidia GeForce GT640 card, something happens. It runs like it's being chased by a nymphomaniac. It's no Lamborghini, but it really does crunch data at a high rate compared to its cost. Back to the right CPU. Here's a comparison between the Intel i5-4670K and the Intel i7-4770K. The results speak for themselves:



*Intel Core i7-4770K advantages:*

In single-threaded programs, the microprocessor has 3% higher performance.
In multi-threaded tasks, the processor has 19% better performance.
Memory performance of the Intel i7-4770K microprocessor is better.

*Intel Core i5-4670K advantages:*

Based on current official prices, the CPU is 29% less expensive than the Intel Core i7-4770K processor.
This CPU has 28% better price/performance ratio.

*Facts:*
There's really not much gained by spending 100 US$ more, and it's even more expensive in Denmark (Denmark is that little zit which grew on top of Germany).







This is why I chose the i5 processor instead of the i7. The i7 has 4 cores and 8 threads; the i5 has 4 cores and 4 threads. But there's not much to gain from double the number of threads: all it will do is run twice the number of jobs at half the speed!!
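For the app_info.xml (anonymous platform) approach mentioned above, the general shape of the file is roughly the following. This is only a sketch: the binary name here is a made-up placeholder, and the app name, version number and file names must match the actual optimized applications you drop into the project directory.

```xml
<app_info>
    <app>
        <name>setiathome_v7</name>
    </app>
    <file_info>
        <name>setiathome_cuda_example_binary</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_v7</app_name>
        <version_num>700</version_num>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>setiathome_cuda_example_binary</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>
```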

And this is the hardware we will use in this 3 test system. As I may have pointed out further down/up the thread:

Intel i5-4670K
Industrial 2U Cooler from JAC
Industrial 2U PSU ATX300W
Asus Maximus VI Extreme Z87 Haswell
8Gb Kingston HyperX Genesis X2 Grey S.
Asus GT640-1GD3-L PCIe 3.0 x16
Asus GT640-1GD3-L PCIe 3.0 x16
Asus GT640-1GD3-L PCIe 3.0 x16
Asus GT640-1GD3-L PCIe 3.0 x16


Test system 1, "wellington", will now be installed the same way as test system 2, "halifax", but with the Ubuntu 12.04.4 update. The Nvidia site tells us that since the 12.04.4 update, the .deb package/CUDA doesn't work any more. Let's see about that.







And then we will have 2 systems running the Server Edition, all CLI and no GUI, headless as well, with multiple GPUs running.







"politics"Take that Mr. Gates







(But, thanks for donating x% of your $ pile to the less fortunate ones) "/politics"









Thanks, Mr. DarkRyder. As always, you Americans are positive and all for going the extra mile.









I just noticed, when editing the text to correct some misspelling, that your local time was 6:41 pm. My G..!! You must be halfway around the world from me. It's 2:46 am here.








All I can say is, "...have a great afternoon, goodnight.."









I'll be back









Kind Regards,
Dan/Denmark


----------



## 2002dunx

Great project but such hard work !

The Z87-WS may be a cheaper option - one I was considering till my electricity bill arrived.









dunx

P.S. My "version" of the BOINC box is an i7-4770K + Asus Impact VI with an R9 280X - so six threads, igpu and a full GPU ! mitx box with a 450W PSU.... know nothing about Linux and lack the time to learn it all from scratch.


----------



## Tex1954

Nice job so far!

I am curious to know which Nvidia driver you ended up using. In a 10-second test, I discovered the LinuxMint default driver didn't like the brand-new PCIe 3.0 GT 630 card I tested....

Also, I would really like to know why you don't like a GUI? The X server doesn't really take up much CPU and hasn't affected any task times for me, but OTOH, I've never run without a GUI and want one so I can use TightVNC to connect.

In any case, great job!


----------



## Tex1954

Another thing I would like to point out is that the advantage of the "Server" edition is minimal and won't affect BOINC processes at all, so far as I can tell. I still think you would be happier with LinuxMint Cinnamon 64-bit, just uninstalling all the crap you don't need.

The LinuxMint folks seem to care about proper integration and testing and don't use that stupid POC Unity junk interface....


----------



## DanHansenDK

Hi Guys,

The stuff is here! We are about to launch test system 3: a pretty serious mobo from Asus and 4 x Asus GT640 Low Profile, the hardware described in the posts above.









_I would really like to know why you don't like a GUI?_

Hi Tex, I have nothing against GUIs. I use them all the time, just not for a headless rack-mounted computer system where all control and remote control are pure CLI. I made some scripts which check CPU temperature, GPU temperature and HDD temperature (the last bit will not be necessary; I'm using SSDs). Tex, this is not a computer which runs BOINC; this is several computers built into a rack, running one thing and doing this one thing only: crunching data. All these crunchers can then be watched using e.g. BoincTasks, but it's really not necessary; there are my "CPU-WatchDog", "GPU-WatchDog" and maybe "FAN-WatchDog" shell scripts, which watch the temperatures, alert if something is getting hot, and shut down when it gets too hot. They also log these things, of course, and mail the alerts/warnings. I'll write a complete how-to when I'm done.









Today we are going to play! The new Asus mobo came with a lot of nice stuff when I bought it. You are not going to believe this. Maybe some of you serious gamers and crunchers already know this mobo; I didn't, and I was amazed when I opened the box!! I've got pictures for you guys, of course.









I'm just starting to assemble the system now, but here are some pictures of the hardware we are going to use and the contents of the mobo box.











My G.., it's packed with all kinds of stuff! Well, here's the 3rd system, empty, prepared for the industrial PSU and several GPUs:



And here's the hardware







Well, I told you about the hardware before... I just love showing you all the stuff lined up.











Please notice the memory modules/RAM. I found these when building test system 2: RAM modules with heat sinks. And the Aerocool X-Vision controls 5 fans and monitors the temperature of 5 units, in this case 1 CPU and 4 GPUs. Here's a better picture:



I forgot to tell you about the OC Panel! A remote control and LCD panel which makes it possible to remotely modify the clock speed of your CPU, with an LCD display which shows you various information about the CPU. It can be fitted in a 5.25" bracket, which fits a 5.25" bay of the computer case. It's connected to the motherboard using a cable fitted to the back panel of the mobo. A funny little gimmick, I think, but in my case it will not be used; there's no room for additional mounting in the 5.25" bays. But maybe you would like to make use of this pretty nice idea. You can see the OC Panel marked with a red box. In the green box you can see the 5 SLI/PCIe 3.0 slots. Great stuff.











Now, let's go! We are going to build this 3rd test system, which will be a headless Ubuntu Linux BOINC cruncher with 1 CPU and 4 GPUs, built into a 2U low-profile rack case. An industrial CPU fan will be used to secure a low core temperature, and the GPUs have large heat sinks and fans of their own as well. An industrial PSU will be fitted when it arrives; it's coming from across the Atlantic and is long overdue. We will use a standard PSU instead. Luckily, a standard PSU fits this case.







Let's get started









I'll report back when entering the next stage



----------



## Tex1954

Ummm, well, all I can say is *WOW*!

Yes, the latest HIGH END mobos, especially the ROG type, have all the bells and whistles for sure.

Those rack cases look pretty cool. Got a link for them? Looks like they stack nicely...


----------



## DanHansenDK

Hi Tex,

They do indeed







A link, right on brother, coming right up... It's an RPS-19-2550 rack case from Germany.








I'll be back









Here it is my friend








http://www.shopwahl.de/a/produktliste/idx/2062200/mot/Rps_19_2550/produktliste.htm


----------



## DanHansenDK

Status:

It's 5:21 am here, and I need a few hours to recharge my batteries









The mobo has been fitted with the CPU, memory modules and all 4 GPUs.
The graphics cards had to be modified with low-profile brackets, and this has been done!
The SSD has been mounted/fitted as well.

I'll be back in a few









Tex, did you find the case using the link I showed you? If not, just say so and I will find the right URL.









Kind Regards,
Dan Hansen
Denmark


----------



## Tex1954

Quote:


> Tex, did you find the case using the link I showed you? If not, just say so and I will find the right URL wink.gif


Yup!


----------



## DanHansenDK

Hi guys,

OK! Done with the boring stuff... Now I'm getting to the fun part.









Ubuntu 12.04.4 still will not work with the .deb package of CUDA 5.5, so I'm using 12.10 instead. Same difference.








Only, I can't get the static IP to work!! ¤%&/()O)(/&%¤ I just didn't expect that much difference between the two versions.







Never mind that, let's go. I'll just guess which IP number I have to use when SSH'ing into test system 3, also known as "Beaufort". Here we go!

By the way, if any of you super gurus out there know the right way to set up a static IP on 12.10, please don't hesitate to leave a post.







Please note that I have tried the Ubuntu guides, the usual stuff. I would like to hear from you if you have tried the usual way, still didn't get it to work, and then solved it somehow. Here's my way of doing it on 12.04:

#2 Reconfigure the network to a static IP:
Command: `vi /etc/network/interfaces`
File:

```
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
address 192.168.x.xxx
netmask 255.255.255.0
network 192.168.x.0
broadcast 192.168.x.255
gateway 192.168.x.1
dns-nameservers 8.8.8.8 8.8.4.4
```

#3 Restart the network:
Command: `/etc/init.d/networking restart`
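Not a guaranteed fix, but two things commonly bite with static IPs on releases after 12.04, and both are worth ruling out (these are assumptions to verify, not confirmed causes of the problem above):

```
# The dns-nameservers line in /etc/network/interfaces is only honoured
# when the resolvconf package is installed:
sudo apt-get install resolvconf

# "networking restart" is deprecated on these releases and does not always
# re-read the file; cycle the interface instead:
sudo ifdown eth0 && sudo ifup eth0
```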


----------



## Tex1954

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi guys,
> 
> OK! Done with the boring stuff... Now I'm getting at the funny part
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Ubuntu 12.04.4 will still not work with .deb package of cuda 5.5 - so I'm using the 12.10 instead. Same difference
> 
> 
> 
> 
> 
> 
> 
> 
> Only, I can't get the static IP to work!! ¤%&/()O)(/&%¤ I just didn't expect that much difference between the two versions
> 
> 
> 
> 
> 
> 
> 
> Never mind that, let's go.. I'll just guess which IP number I have to use when SHH'ing test system 3 also known as "Beaufort". Here we go!
> 
> By the way, if any of you super guru's out there, should know the right way to setup static IP for 12.10, please don't hesitate to leave a post
> 
> 
> 
> 
> 
> 
> 
> Please notice that I have tried ubuntu guides, the usual stuff. I would like to hear from you if you have tried the usual way and still didn't get it to work and then solved it somehow. Here's my way of doing it on 12.04:
> 
> #2 Reconfigure the network to static IP:
> Command: # vi /etc/network/interfaces
> File:
> 
> # The loopback network interface
> auto lo
> iface lo inet loopback
> 
> # The primary network interface
> auto eth0
> iface eth0 inet static
> address 192.168.x.xxx
> netmask 255.255.255.0
> network 192.168.x.0
> broadcast 192.168.x.255
> gateway 192.168.x.1
> dns-nameservers 8.8.8.8 8.8.4.4
> 
> #3
> Command: # /etc/init.d/networking restart


Ah yes... the joys of Ubuntu past 11.10 64-bit... so many problems I had... I NEVER got anything working well after Unity was implemented... and in fact it hated my Realtek LAN on the Sabertooth as well.

The ONLY one that has ever worked flawlessly for me in all respects so far is LinuxMint 14... haven't tried 16 yet...

Enjoy the experience and good luck!


----------



## DanHansenDK

Hello superusers









First positive sign from system 3 aka "Beaufort":

```
# nvidia-smi -a | grep Gpu
Gpu : N/A
Gpu : 32 C
Gpu : N/A
Gpu : 28 C
Gpu : N/A
Gpu : 31 C
Gpu : N/A
Gpu : 31 C
```


----------



## Tex1954

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hello superusers
> 
> 
> 
> 
> 
> 
> 
> 
> 
> First positive sign from system 3 aka "Beaufort":
> 
> # nvidia-smi -a |grep Gpu
> Gpu : N/A
> Gpu : 32 C
> Gpu : N/A
> Gpu : 28 C
> Gpu : N/A
> Gpu : 31 C
> Gpu : N/A
> Gpu : 31 C


Congrats!


----------



## DanHansenDK

And here it is guys!!!
Up and running, crunching 4 jobs on the GPUs and 4 jobs on the CPU.









OK, I would have shown you an image, but I can't. No rights any more, I guess.









Hi Tex, thanks. Well, there's a problem: all 4 GPUs run hot. Please note that it's exactly the same case and graphics card as in test system 2. Here are the numbers from test system 2:

```
# nvidia-smi -a | grep Gpu
Gpu : N/A
Gpu : 45 C
```

Both test systems 1 and 2 run at a steady 45 degrees Celsius, but on the new system something happens! I have an idea, but I'm not sure at all. Anybody have a suggestion? I thought it might be the PSU; it's not very large. But it's 350 watts like the others. Then again, we've got 4 GPUs running, so is that enough? What I thought was that maybe the graphics cards don't get the power they need, and therefore the fans don't run as fast as on the other test systems!?!? I don't know very much about this, but I know you guys do! So please, let's hear what you think.









Here are the numbers from test system 3, running 4 GPUs:

```
# nvidia-smi -a | grep Gpu
Gpu : N/A
Gpu : 56 C
Gpu : N/A
Gpu : 49 C
Gpu : N/A
Gpu : 54 C
Gpu : N/A
Gpu : 58 C
```

I'm looking forward so much to hearing from you.









Kind Regards,
Dan


----------



## Tex1954

Looks fine to me!


----------



## Finrond

Quote:


> Originally Posted by *DanHansenDK*
> 
> Both test system 1 and 2 runs at a steady 45 degrees Celsius, but the new system something happens! I have an idea, but I'm not sure at all. Anybody who has a suggestion? I thought it might be the PSU, it's not very large. But it's 350wats like the others. But then again, we got 4 GPU's running, so is it enough? What I thought was, that maybe the graphic cards doesn't get the power needed, and therefore the fan doesn't run as fast as on the other test systems!?!? I don't know very much about this, but I know you guys does! So please, let's hear what you think
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Here's the number from test system 3 running 4 GPU's:
> 
> # nvidia-smi -a |grep Gpu
> Gpu : N/A
> Gpu : 56 C
> Gpu : N/A
> Gpu : 49 C
> Gpu : N/A
> Gpu : 54 C
> Gpu : N/A
> Gpu : 58 C
> 
> I looking so much forward to hear from you
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Kind Regards,
> Dan


Looks OK to me; those temps are well within normal operating parameters.


----------



## tictoc

Quote:


> Originally Posted by *DanHansenDK*
> 
> By the way, if any of you super gurus out there should know the right way to set up a static IP on 12.10, please don't hesitate to leave a post
> 
> 
> 
> 
> 
> 
> 
> Please notice that I have tried the Ubuntu guides, the usual stuff. I would like to hear from you if you have tried the usual way, still didn't get it to work, and then solved it somehow.


There are some very knowledgeable members in the Linux, Unix forum. You might want to look through that forum or start a new thread.
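For reference, the classic approach on 12.x-era Ubuntu Server is a static stanza in /etc/network/interfaces; the addresses below are placeholders for whatever the LAN actually uses:

```text
# /etc/network/interfaces -- example static configuration (placeholder addresses)
auto eth0
iface eth0 inet static
    address 192.168.1.50
    netmask 255.255.255.0
    gateway 192.168.1.1
    dns-nameservers 192.168.1.1
```

After editing, `sudo ifdown eth0 && sudo ifup eth0` (or a reboot) applies it. The `dns-nameservers` line relies on the resolvconf package that ships with 12.04+; on older setups the nameserver goes in /etc/resolv.conf instead. If the setting still doesn't stick, check whether Network Manager is fighting over the interface.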









Quote:


> Originally Posted by *DanHansenDK*
> 
> And here it is guys!!!
> Up and running, crunching 4 jobs by GPU's and 4 jobs by the CPU
> 
> 
> 
> 
> 
> 
> 
> 
> 
> OK, I would have showed you an image, but I cant. No rights any more I guess
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Hi Tex, thanks, well, there's a problem. All 4 GPU's runs hot. Please notice that it's exactly the same case and graphic card as in test system 2. Here's the numbers from test system 2:
> 
> # nvidia-smi -a |grep Gpu
> Gpu : N/A
> Gpu : 45 C
> 
> Both test system 1 and 2 runs at a steady 45 degrees Celsius, but the new system something happens! I have an idea, but I'm not sure at all. Anybody who has a suggestion? I thought it might be the PSU, it's not very large. But it's 350wats like the others. But then again, we got 4 GPU's running, so is it enough? What I thought was, that maybe the graphic cards doesn't get the power needed, and therefore the fan doesn't run as fast as on the other test systems!?!? I don't know very much about this, but I know you guys does! So please, let's hear what you think
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Here's the number from test system 3 running 4 GPU's:
> 
> # nvidia-smi -a |grep Gpu
> Gpu : N/A
> Gpu : 56 C
> Gpu : N/A
> Gpu : 49 C
> Gpu : N/A
> Gpu : 54 C
> Gpu : N/A
> Gpu : 58 C
> 
> I looking so much forward to hear from you
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Kind Regards,
> Dan


Temps on the quad-GPU system look good. The increased temps are just the GPUs dumping heat into the case. I wouldn't start to worry unless the temps get into the 80+ degree range. The only way to decrease the temps would be to put in some higher-airflow case fans.

I would definitely recommend upgrading the PSU on the quad GPU system. Even with the low power requirements of the GT 640, I wouldn't feel comfortable BOINCing 24/7 with less than a high quality 450-550 Watt PSU.


----------



## Finrond

Quote:


> Originally Posted by *tictoc*
> 
> There are some very knowledgeable members in the Linux, Unix forum. You might want to look through that forum or start a new thread.
> 
> 
> 
> 
> 
> 
> 
> 
> Temps on the quad GPU system look good. The increased temps are just the GPUs dumping heat into the case. I wouldn't start to worry unless the temps start to go in the 80+ degree range. The only way to decrease the temps would be to put in some higher power case fans.
> 
> I would definitely recommend upgrading the PSU on the quad GPU system. Even with the low power requirements of the GT 640, I wouldn't feel comfortable BOINCing 24/7 with less than a high quality 450-550 Watt PSU.


Depends. If those are GT 610s I wouldn't worry about it, as they are only 30-watt cards. If however he's got 640s (especially the DDR3 version), he might be nearing the power envelope of that PSU.

EDIT: I just re-read the thread; those are DDR3 640s, which means a TDP of 65 watts each. I might consider a bump to a 400-500 watt power supply if you can afford it.
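As a rough sanity check of that recommendation, the budget can be sketched in a few lines of shell. The GPU TDP comes from the posts above; the CPU and "rest of system" figures are assumptions, not measurements:

```shell
# Back-of-the-envelope power budget for the quad-GPU box.
gpu_tdp=65          # watts, DDR3 GT 640 per Finrond's note (GDDR5 rev is ~49 W)
gpu_count=4
cpu_tdp=84          # assumed: a typical LGA1150 quad-core TDP
rest=50             # assumed: board, RAM, SSD, fans

total=$((gpu_tdp * gpu_count + cpu_tdp + rest))
echo "Estimated peak draw: ${total} W"

# Rule of thumb: keep sustained load under ~80% of the PSU rating.
psu=$((total * 100 / 80))
echo "Suggested PSU rating: at least ${psu} W"
```

With these numbers the estimate lands just under 400 W peak, which is why a quality 450-550 W unit is the comfortable choice.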


----------



## DanHansenDK

Hi Tex and Finrond,

OK. But they were pretty hot when I touched them... oops, that didn't come out right







They felt pretty hot, and I didn't quite trust the readings.
Here, before uni, I did a test: removed 2 cards and left 2 in, with some space between them, so that the air from the 4 large fans could reach them. Now the readings are much better:

TEST, SYSTEM 3 - GRAPHIC CARDS 1 & 1:

[email protected]:/home/drhadmin# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +105.0°C)
temp2: +29.8°C (crit = +105.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +60.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +57.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +59.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +60.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +58.0°C (high = +80.0°C, crit = +100.0°C)

[email protected]:/home/drhadmin# nvidia-smi -a |grep Gpu
Gpu : N/A
Gpu : 49 C
Gpu : N/A
Gpu : 50 C

I just ordered a 510 watt 2U PSU instead. According to Nvidia's own numbers, each Asus GT640 GDDR5 uses 49 watts. 4 times 49 watts is close to 200 watts, or so my calculator tells me, which leaves 100+ watts for the rest of the system. OK, it's CLI and headless, so not many things need power here, but anyway, I want to rule out the power problem. I also ordered 4 extra fans which can be pointed directly at the problem. Looks like this:

Let's see how it goes. While waiting for the mail man (this is Denmark, where packages arrive the next day at the soonest, not the States, where packages arrive before the confirmation email), I will do a 1+1 & 1 test (2 graphics cards beside each other and 1 alone, using slots 1, 2 and 4), and then let's see what happens









Hi TicToc,

Thanks for the link to a fine forum







I'll search for some know-how in there








Quote:


> Temps on the quad GPU system look good. The increased temps are just the GPUs dumping heat into the case. I wouldn't start to worry unless the temps start to go in the 80+ degree range. The only way to decrease the temps would be to put in some higher power case fans.


OK, but they felt pretty d... hot. One time the system rebooted - but I'm not sure if that was because of the CPU getting too hot. The industrial 2U cooler hasn't arrived yet; not until the 22nd. (Maybe I should look for another supplier? They are pretty d... expensive anyway)
Quote:


> I would definitely recommend upgrading the PSU on the quad GPU system. Even with the low power requirements of the GT 640, I wouldn't feel comfortable BOINCing 24/7 with less than a high quality 450-550 Watt PSU.


Check! Done that, and thank you. I chose a 510 watt PSU.

Finrond:
Quote:


> Depends, if those are GT 610's I wouldn't worry about it as they are only 30watt cards. If however hes got 640's (especially the DDR3 version) he might be nearing the power envelope of that PSU.
> EDIT: I just re-read the thread, those are DDR3 640's which means a TDP of 65watt each. I might consider a bump to a 400-500 watt power supply if you can afford it.


OK, let's try that and see what happens. I will test with all 4 GPUs running before installing more fans, so we can see whether insufficient power is the reason, or part of the reason, for the problem









The DDR3 version of the Asus GT640 wasn't in stock, so I chose the GDDR5 version. It consumes less power, but then again, why does it do that? Because it's a sleepy fellow??? Well, I will do a comparison whenever my supplier of those cards decides to run a serious business and stocks both types









As the former governor of California would put it: _"I'll be back..."_


----------



## Finrond

Quote:


> Originally Posted by *DanHansenDK*
> 
> Finrond:
> OK, Let's try that and see what happens. I will try and test with all 4 GPU's running, before installing more fans. So that we will see what happens. If the power, which may be insufficient, may be the reason or maybe some of the reason for the problem
> 
> 
> 
> 
> 
> 
> 
> 
> 
> The DDR3 version of Asus GT640 wasn't in stock so I chose the GDDR5 version. Consumes less power, but then again, why does it do that? Because it's a sleepy fellow??? Well, I will do a comparison when ever my supplier of those cards decides to run a serious business and get both types
> 
> 
> 
> 
> 
> 
> 
> 
> 
> As the former governor of California would put it: _"....'ll be back.."_


The GDDR5 version uses less power because GDDR5 chips run at a lower voltage than DDR3 and hence consume less power. They may also have fewer chips on the board (confirmed: the GDDR5 version is only 64 bits wide with 1GB RAM). The cards are definitely going to feel hot to the touch when they are BOINCing, as 60 °C is still hot relative to human body temperature, but it's quite cool for a GPU. Nvidia rates the max temp for the GDDR5 version of the 640 at 95 °C, so you have plenty of thermal headroom left (not that I would run them that hot anyway).


----------



## jetpak12

I believe you have what is called the GT640 "Rev 2". On top of having GDDR5 in place of DDR3, it also has a cut-down core with fewer ROPs and texture mappers, but the shader count remains the same as the old GT640. It also got a decent stock clock boost, which should work out to better compute performance at fewer watts.










I have the Gigabyte version of the same card running in the BGB right now, and it has been staying right at 59 °C for the past two days with 99% GPU usage and fan settings on auto. It also has an 8800GT sandwiched right on top of it running at full bore too, and I think temps on that guy are in the 70s (and the 8800GT fan is so much louder!)


----------



## DanHansenDK

Hi Finrond!

Thanks man! Great to hear this; I was a little worried









OK! Let's continue! I just received this 550 watt 2U industrial PSU! I've got a picture for you, but I'll post it later on, since it's still on my camera's SD card. It was very expensive: 1,695 Danish viking crowns, which is almost 250 US dollars!!
I will perform a test where I measure the power consumed when running 2, 3 and 4 GPUs at once. And then we'll see what happens









Thanks for the help guys!! Really!

Kind Regards,
Dan/Denmark


----------



## Tex1954

Outstanding job on your systems! I "KNOW" how hard it is to get a distro of Linux running the way you have.

I did test Linux Mint 16 and found a problem that prevents it from working 100%... sigh... looks like 14 is still the best for me.


----------



## DanHansenDK

Hi Jetpack..

Thanks for the info!! I didn't know that!

Tex, is there something I can do for you? I solved several issues regarding Linux... Just say the word!!

OK, we are now testing the system with the right CPU cooler - no more hardware resets due to a hot CPU! I got it from Taiwan a few days ago. It really does its job very well!! A nice steady 54-58 degrees Celsius for all 4 cores...
The new 550 watt PSU is installed too. And right now I'm testing the system with 1 + 2 GPUs, to see how hot the 2 GT640s will get running side by side. Right now, after 10 minutes of running at 100%, this is how it looks:

[email protected]# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +105.0°C)
temp2: +29.8°C (crit = +105.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +61.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +61.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +59.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +60.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +59.0°C (high = +80.0°C, crit = +100.0°C)

[email protected]# nvidia-smi -a |grep Gpu
Gpu : N/A
Gpu : 50 C (mounted alone)
Gpu : N/A
Gpu : 55 C (mounted side by side with the card beneath)
Gpu : N/A
Gpu : 51 C

The power consumed right now is 227 watts!! Not at all what I expected!!!

Regarding BoincTasks: does anyone here use this piece of software, and if so, do you know what the orange color indicates? Please look at this image:


I'll be back


----------



## mm67

Quote:


> Originally Posted by *DanHansenDK*
> 
> Regarding BoincTasks, is there anyone of you, who uses this peace of software and if, do you know what the orange color illustrates? Please look at this image:
> 
> 
> I'll be back


The orange color means that the task is running as a high-priority task


----------



## DanHansenDK

Hello mm67









Thank you for easing my mind









OK, we are now running 100% on the CPU and 100% on all 4 GPUs. This time I'm using 2 x SpotCool fans, as an image will show. Here are the results after 15 minutes of running. (I have to note that it's hot at this time in Denmark, so the CPU results are +10 degrees!! We are at 29 degrees Celsius indoors right now. Could use some of those American air conditioners here









After 15 min. of 100% CPU and 100% GPU's:

[email protected]# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +105.0°C)
temp2: +29.8°C (crit = +105.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +71.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +70.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +71.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +67.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +63.0°C (high = +80.0°C, crit = +100.0°C)

[email protected]## nvidia-smi -a |grep Gpu
Gpu : N/A
Gpu : 48 C
Gpu : N/A
Gpu : 52 C
Gpu : N/A
Gpu : 50 C
Gpu : N/A
Gpu : 52 C

I will take a picture and show it here. I hope some of you guys have an idea how to solve the cooling problem. Maybe replace the case fans with super-duper fans???? Or add some, somewhere???? I just don't know where to put them. There's room for plenty of fans, just not in the right spot!?!?

OK, here's the monster 2U PSU. Not monster in watts, but monster in cost. I paid about 1,698 viking crowns, I think it was. In US dollars, that's almost $240!! plus taxes... My G..!! Well, here it is... Doesn't look anything out of the ordinary










And here are the toys we are going to try to cool it with: 2 x SpotCool and a watt meter, to see if this special 2U 550 watt PSU was really necessary. These first readings tell us it was NOT necessary. But let's see when it's all up and running. It's not a bad thing if we can build a system which can do what this does and still doesn't use more power than an ordinary desktop computer running the same stuff.


Well, here's the third test system, installed at last, with the right CPU cooler and all 4 GPUs running. The SpotCool fan does a really great job! I've got no data yet, but this will be logged during the night (UTC+2). Oooooooops! 01:49 am. Have to sign off now!! The clock sings out a tune for me at exactly 05:00. Nighty-night, friends!!
Here's the system as it is running right now:




By the way, this is the next motherboard I'm going to test! Waterproof and everything. 1,000 viking crowns cheaper than this mobo, so this is pretty interesting...
The very expensive mobo from Asus (Extreme IV) used in test system 3 sadly had components, and legs from components, placed directly in the path of the metal mounting bracket for the 2U industrial CPU cooler. So I had to cut metal again!! Not very funny when using hardware at these prices!! Same problem with test systems 1 and 2!! This I find very hard to understand: Asus, a really wonderful manufacturer, placing stuff right where the CPU bracket of a 2U cooler from CoolJAC goes. I just don't understand it! It's not hard to solve this, not at all. The software used to lay out the traces on the motherboard - or circuit board, which may be the better word - can and will solve this issue in less than a microsecond. Well, let's see if ASRock solved this issue, and if there is an issue at all.



Only difference regarding the SLI is this:
_- 3 x PCI Express 3.0 x16 slots (PCIE1/PCIE2/PCIE4: single at x16 (PCIE1); dual at x8 (PCIE1) / x8 (PCIE2); triple at x8 (PCIE1) / x4 (PCIE2) / x4 (PCIE4))
- 1 x PCI Express 2.0 x16 slot (PCIE6: x4 mode)_
But this is not a problem at all, because we are using PCIe2.0 cards anyway!! I'm looking forward to see the board, when it arrives!







02:11 am, night!

See you soon friends









Kind Regards,
Dan


----------



## Tex1954

Quote:


> Originally Posted by *DanHansenDK*
> 
> Tex, is there something I can do for you? I solved several issues regarding Linux... Just say the word!!
> 
> )


Well, the problem stems from the Ubuntu base that Linux Mint is built on.

First problem: later versions stopped faking the EDID info when a VGA dummy plug was used. This causes all sorts of problems with Xorg drivers. The simple fix for me is to get a used 15"-17" monitor that supports EDID and use it to install things... It seems this problem only happens on systems where the mobo has no built-in display support; I generally have zero problems with mobos that do have built-in support. This is a known, designed-in behavior that will never be changed.

The second problem has a workaround (sudo gedit)... namely, the "chmod" command to add permissions so I can access the BOINC directories doesn't seem to work properly in Linux Mint 16... A search of the message boards reveals they are aware of the problem but will not fix it now, because version 17 will release soon.

The third problem is that I am tired of Ubuntu and their tons of buggy releases that always have bugs that never get fixed, and sometimes they break things that used to work fine!

I'll be looking into other Linux distro's again soon. For now, LM-14 works fine so I will keep it until something better arrives.

Anyways, after tons of meds and other things going on in my old body at the moment... I'll finally be getting back on the road to pay bills in a day or so...

I could use a new cloned younger body... just transplant my brain into it...

LOL!


----------



## DanHansenDK

Hi Tex,

EDID - isn't that mainly for the Windows OS? EDID stores information about resolutions and other settings, right?
Quote:


> "...VGA dummy plug was used This causes all sorts of problems with Xorg drivers...."


I'm pretty sure we can find a solution, Tex! These are exactly the things I've been fighting. Maybe there's a solution to your problem in my toolbox







Mint is based on Ubuntu, right? Debian packages should work on Mint too. What's the package manager called on Mint?
Please describe the problem a little better for those of us who are not that quick








Quote:


> The third problem is I am tired of Ununtu and their tons of buggy releases that always have bugs that never get fixed and sometimes they break things that used to work fine!


Right on, brother!! I hear you loud and clear!! I've got a perfect example: I found a way to make my headless server very early on, but after the 12.04.4 update, the CUDA 5.5 .deb didn't work anymore!! What a ..... of horses... ;( Pardon my French
Quote:


> I'll be looking into other Linux distro's again soon. For now, LM-14 works fine so I will keep it until something better arrives.


Maybe we should go "debootstrap", or maybe even Debian??? I've got a friend at Berkeley; he's trying to talk me into choosing another distro all the time. Well, all of my problems were solved one by one using Debian solutions. Maybe that's the way forward. Let's do the research







I'll keep it in mind, that's for sure...
Quote:


> Anyways, after tons of meds and other things going on in my old body at the moment... I'll finally be getting back on the road to pay bills in a day or so...


Hey, what are you doing by the way? Just for the record









Quote:


> I could use a new cloned younger body... just transplant my brain into it...


Well, we are building robots at DTU (MIT-->DK), but we haven't reached the point where brain transplantation is possible! I will let you know, the moment it happens









Hear from you soon I hope


----------



## DanHansenDK

Regarding Test System 3 (Beaufort),
these are the numbers after 24 hours of running at 100%:

Code:


[email protected]# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 49 C
        Gpu                         : N/A
        Gpu                         : 47 C
        Gpu                         : N/A
        Gpu                         : 48 C
        Gpu                         : N/A
        Gpu                         : 50 C

[email protected]# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +27.8°C  (crit = +105.0°C)
temp2:        +29.8°C  (crit = +105.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +67.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +67.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +65.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +63.0°C  (high = +80.0°C, crit = +100.0°C)

Power consumption is still at a low 227 watts!! This is very good news, I think! Well, maybe not entirely good news - I did spend a pretty serious amount of viking crowns on this 2U industrial 550 watt PSU!!! So it's not all good









OK, let's see what happens. I've written my "dealer" and asked them to find a fan which runs at twice the speed of these standard 2U case fans!! These fans run at a boring 2,100 RPM (revolutions per minute).

I'll go on with the shell scripts GPUWatchDog, CPUWatchDog and FanWatchDog, which are certainly needed now!!!
We have one little issue regarding the fans: because of the drivers installed to make the headless server work (any Nvidia driver), the fan control in lm-sensors doesn't work anymore! We will have to solve this before a script can be written, so if any of you brainiacs out there has a suggestion, I'll be more than happy to hear it
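For what it's worth, a minimal sketch of what such a GPUWatchDog could look like. It needs only nvidia-smi, not lm-sensors, so the broken fan readings don't matter for this part. The 80 °C limit follows the advice earlier in the thread; the boinc-client init script path is an assumption, adjust it to your install:

```shell
#!/bin/sh
# GPUWatchDog sketch: stop BOINC if any GPU gets too hot.
LIMIT=80   # degrees C; the worry threshold suggested earlier in the thread

# Reads `nvidia-smi -a` style output on stdin and prints the hottest
# GPU temperature (input lines look like "Gpu : 56 C" or "Gpu : N/A").
max_gpu_temp() {
    awk -F': ' '/Gpu/ && $2 ~ /C$/ {
        sub(/ C$/, "", $2)
        if ($2 + 0 > max) max = $2 + 0
    } END { print max + 0 }'
}

# Only act when nvidia-smi is actually available (i.e. on the server).
if command -v nvidia-smi >/dev/null 2>&1; then
    t=$(nvidia-smi -a | max_gpu_temp)
    if [ "$t" -ge "$LIMIT" ]; then
        echo "GPU at ${t} C, stopping BOINC" >&2
        /etc/init.d/boinc-client stop    # assumed service path
    fi
fi
```

Run it from cron every minute or so; a matching script could start boinc-client again once the temperature drops.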









See you soon guys


----------



## Tex1954

Quote:


> Originally Posted by *DanHansenDK*
> 
> I'll go on with the shell scripts GPUWatchDog, CPUWatchDog and FanWatchDog which is certainly needed now !!!
> We have one little issue regarding the fans. Because of the drivers installed, to make the headless server work (any Nvidia driver), the fan control in LM-sensors doesn't work any more! This we will have to solve before a script can be done, so if any of you brainee's out there has a suggestion, I'll be more than happy to hear it
> 
> 
> 
> 
> 
> 
> 
> 
> 
> See you soon guys


Okay, I think you are talking about the fan control for the GPUs??? If so, the GPU firmware controls them well enough... and GPUs can run at up to 80 °C all day long without any problems...

The CPU fan control is configurable in the mobo BIOS too... so you can twiddle with that if need be...

I see no problems in your temp readings if that is under full load.


----------



## jetpak12

Quote:


> Originally Posted by *DanHansenDK*
> 
> Power consumption is still at low 227 watts!! This is very good news I think! Well, maybe not that good news anyway
> 
> 
> 
> 
> 
> 
> 
> I did use a pretty serious amount of viking crowns on this 2U industrial 550 watts PSU!!! So it's not all good


I believe PSUs tend to run most efficiently at around 50% load, so maybe not all is lost!









And temps look fine to me, like Tex said. Can you see what speed the fans are running at to give those temps? The GPU BIOS should ramp them up automatically as the temperature increases, to keep things under control for continuous running.

Also (and more importantly) it looks like you have a stack of three systems - why is this not a 3 x 4 = 12 GPU mega-rig?!


----------



## DanHansenDK

Hi Tex,

Well, OK... The temps are all right now, yes they are... But when I close up the case, I'm pretty sure all the heat generated by the GPUs will increase the case temp. You know, a kind of loop? In Danish we call it "a bad circle" - a vicious circle, I suppose.
I need some more cooling. Not larger heat sinks, but better airflow through the case









Regarding your problem, could this be of any help to you?
http://www.playtool.com/pages/dvitrouble/dvitrouble.html

*Solutions: rewrite the EDID data*
_The video card reads the EDID data from the monitor to get the capabilities of the monitor. Unfortunately, some monitors have less-than-stellar EDID data. Some LCD-TVs which are supposed to be compatible with computers through a DVI connection understate their maximum resolution in the EDID data. It doesn't make sense, but it has happened. In some cases, you can update the EDID data in the monitor by running a program on your computer. Most monitors can't do this but it's worth checking the monitor manufacturer's web site to see if they have any EDID patches for your monitor. These usually affect the maximum resolution which can be displayed but can sometimes solve other problems as well. If you know quite a bit about your monitor specifications then you can check your EDID data to make sure it matches what you know about the monitor. ViewSonic provides a handy program called EDID.EXE which can read your EDID data and display it in a readable format. You don't need a ViewSonic monitor to use it._

Kind Regards,
Dan


----------



## Tex1954

I found another cure by editing the xorg.conf file and putting in some dummy values to force it to work... so far so good...
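Tex didn't post his exact file, but the kind of dummy values people use for headless NVIDIA boxes looks roughly like this. The EDID file path and the DFP-0 connector name are placeholders; ConnectedMonitor and CustomEDID are options documented in NVIDIA's driver README:

```text
# /etc/X11/xorg.conf fragment -- headless NVIDIA sketch (values are examples)
Section "Device"
    Identifier "nvidia0"
    Driver     "nvidia"
    Option     "ConnectedMonitor" "DFP-0"              # pretend a panel is attached
    Option     "CustomEDID" "DFP-0:/etc/X11/edid.bin"  # EDID dumped from a real monitor
EndSection

Section "Screen"
    Identifier   "screen0"
    Device       "nvidia0"
    DefaultDepth 24
    SubSection "Display"
        Virtual 1024 768
    EndSubSection
EndSection
```

The edid.bin can be captured once from a real monitor (e.g. with `nvidia-settings`) and then reused on the headless box.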










PS: On the road again soon... Tomorrow looks like.


----------



## tictoc

Nice work on forcing Linux to do what you want.









Be safe on the road, and hopefully all your gear keeps crunching through the Pentathlon.


----------



## DanHansenDK

Hi Tex,

Great! Because it really sounds like some of the same issues we have/had








Looking forward so much to receiving those new fans. I'd hate to have to drop one GPU due to the heat issue. But again, this is Denmark, where snails reach the finish line before packages









Hi TicToc,
Thanks my friend







OK, I'm not quite sure what it is, but I can see your link, 5th BOINC Pentathlon - May 5th - 19th, 2014, and it must be some kind of competition, or a day where everybody gives everything. Well, I turned on everything!!! And I mean everything!
I've been tied up at school and in my business, so I'm a little behind... Sorry for that, guys. I'll try to get updated ASAP...

Have a nice one all of you!

Kind Regards,
Dan/Denmark


----------



## Finrond

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Tex,
> 
> Great! Because, it really sounds like it's some of the same issues we got/had
> 
> 
> 
> 
> 
> 
> 
> 
> Looking so much forward to receiving those new fans.. I hate to have to drop one GPU due to the heat issue..But, again, this is Denmark where snails reach the finish line before packages
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Hi TicToc,
> Thanks my friend
> 
> 
> 
> 
> 
> 
> 
> OK, I'm not quite sure what it is, but I can see your link 5th BOINC Pentathlon - May 5th - 19th, 2014, and it must be some kind of competition or a day where everybody gives everything. Well, I turned on everything!!! And I mean everything!
> I've been tied up at school and in my business, so I'm a little behind... Sorry for that guys. I'll try to get updated ASAP..
> 
> Have a nice one all of you!
> 
> Kind Regards,
> Dan/Denmark


Are you running theSkyNet POGS? That's the current project for the competition.


----------



## DanHansenDK

Well hellooo friends !!!

After a little "vacation" (summertime in Denmark) I'm back.

Last time, I was looking for a faster, high-end fan for the 2U rack-mounted case. 3 x Asus GT640 work just fine, but when a 4th is added, the heat in the case gets very high. This is because of the little space left in the case, and because the hot air in the case isn't blown out.

I managed to find these high-end fans, which I ordered yesterday:
*Vantec Tornado 80mm Double Ball Bearing High Air Flow Case Fan - Model TD8038H*
_Vantec's Tornado TD8038H 80mm black case fan is a powerful cooling solution for today's desktop computers. The dual ball-bearing design extends the life span of the fan and keeps it spinning at *5,700 RPM*. It easily installs in a wide variety of desktop cases, and yet it is capable of pushing more than three times the air of a standard 80mm case fan - whether it's used to pull cool air into your system or push hot air out._


*A new mobo will be used*
And because the board from Asus is so expensive, I had to find another board, just to keep the costs down. 2,500 kr. (about US$358) for a mobo is OK if you only need one. But when you are building a system which you are going to make 10 or 20 pcs. of, you have to keep the prices in mind. Therefore I've been looking for another board which does the same and costs less. I wrote about this mobo some time ago, but here it is again. The mobo from ASRock (I've had nothing but positive experiences with mobos from ASRock) costs only half as much as the mobo from Asus. Well, the mobo from Asus comes with all sorts of nice accessories, but we don't need those for this project.

*ASRock Z87 OC Formula*
http://www.asrock.com/mb/Intel/Z87%20OC%20Formula/


So, in this test we'll use the mobo from ASRock. The new fans will be tested too. Let's see if we can get the heat in the case down using all 4 low-profile Asus GT640s


----------



## magic8192

You seriously have the BOINC bug








I am considering running linux on my dedicated BOINC box.


----------



## Tex1954

That is serious stuff... and serious electric bill too..


----------



## DarkRyder

Quote:


> Originally Posted by *Tex1954*
> 
> That is serious stuff... and serious electric bill too..


I think I should change my screen name to "Electric Bill" lol


----------



## DanHansenDK

Hi Magic 8192, Tex and DarkRyder









Quote:


> I am considering running linux on my dedicated BOINC box.


Well, I'll post my ToDo here. You guys will be the first to know. You have shown the most interest, so this is where I'll post it.
I was waiting for Ubuntu to fix the bug that appeared when upgrading to version 12.04.4, but the fix doesn't seem to be coming!?!?
Therefore, I'll show the ToDo for running a headless Linux BOINC server with multiple GPUs crunching







The Linux version will be Ubuntu Server 12.10 ...
Quote:


> That is serious stuff... and serious electric bill too..


Hi, Tex...
Yes, you are quite right my friend







I'm using about 2 or 3 times the kWh now that I was using before... That's why I'm trying to make the system "headless", using SSDs, etc. It's not much, but maybe all together it will mean that we save a little








Before running BOINC systems, I had an electric bill of about US$100 a month. My last bill was US$308!
Quote:


> i think i should change my screenname to "Electric Bill" lol


Yeah, it's not the hardware, it's the electricity that's costly







But then again, that's why I'm attending DTU in Denmark for 3 more years starting this winter. I'm trying hard to find a way to make "green" power at home without having to buy a costly system...

Well, I'll show the things for this test beneath. Got the last part yesterday! The special fan, which we have to use.
Mobo: ASRock Z87 OC Formula
CPU: i5-4690K
Memory: 2 x 4Gb Kingston HyperX DDR3 1866MHz
GPU's: Asus GeForce GT640 Low Profile
Harddrive: SanDisk Ultra 60Gb SSD
Other stuff: AeroCool X-vision Fan/temp.-control

Now, what's most exciting is whether we need to "cut" the mounting bracket for the industrial cooler/fan like we had to with the Asus mobos! The bracket didn't fit any of the 3 Asus models I tried. I'm a little nervous: I was able to fix it on the Asus boards, but what if this mobo is designed even worse? (Not that the mobos are badly designed, they're just not designed for industrial use. Hope I don't offend anyone!)

Forgot to mention why I chose the AeroCool X-Vision fan/temperature control for this system. The X-Vision has 5 temperature sensors and 5 fan controls, so we can monitor 5 places and control 5 fans. That's perfect here, because we have 1 CPU and 4 GPUs: 1 industrial CPU cooler and 4 additional case fans. This way we can monitor all the working processors and the fans that keep them cool. I'll show it in more detail later on.






Let's begin...
I'll take pictures along the way, and I'll post the ToDo for making a "Headless Linux Multiple GPU Boinc Server" too.









Time: 13:59
Checking the mobo for space, to make sure the reinforcing bracket for the industrial cooler will fit.



Time: 14:11
Close, but no cigar! Let's take a look at the problem. It was very near, but there is still the same issue, in fewer places of course, but nevertheless we will have to "cut" the problem away. It's a lot better than with the Asus mobo, but we'll still have to take hard measures to make it work, and I hate to use tools on a piece of electrical hardware.
You can see the issue here. In the first image you can see how the mobo looks with a standard cooler from e.g. Intel. In the second image you can see the reinforcing bracket for the industrial cooler and the problem when trying to fit it onto the mobo.


It's possible we could "cut" the leg/pin directly on the mobo, but I don't like that idea much. Why? Because the bracket can be replaced; the mobo can't.








I'll go into my workshop and start working on the mobo/bracket, so that we can get on with it







OK, it's irritating, but it's a lot less work than with the Asus board, so we'll live


----------



## Tex1954

Yikes that bracket is so close... should be easy to grind some material off...

Dremel to the rescue!


----------



## magic8192

Very nice setup. Are you going to Overclock the i5-4690K?


----------



## DanHansenDK

Hi Guys,
Quote:


> Yikes that bracket is so close... should be easy to grind some material off...
> Dremel to the rescue!


Tex is so right! A Dremel is exactly what I used to "kiss" the bracket. I don't know what it's called in English, but a grinder has a spinning disc, and that's what I used, just a miniature version of it.
After grinding the little bit off, I remove any small metal shavings with a sharp scalpel or knife, and then I seal it with nail polish so that there will be no short circuits.



Quote:


> Very nice setup. Are you going to Overclock the i5-4690K?


Hi Magic, no, no overclocking! Why? Because I don't think we gain that much crunching by overclocking. If I'm wrong, please correct me.
As you can see, I'm using the i5 CPU instead of the i7, because there's only 0.3% to gain by choosing the i7! I showed the tests earlier in this thread: it would run twice as many jobs, but at half the speed. The reason I don't overclock is that I'm trying to build a system that doesn't run hot and doesn't get too close to the CPU thermal shutdown limit. Running Linux/BOINC, the CPU will be used 100% 24/7 without any breaks! I'm pretty sure there's not much to gain by "tuning" this system. I get why gamers overclock, indeed I do, but in this case I think it would only add to the pile of issues.









I made a little image so we can see the small difference between the i5-4690K and i7-4790K CPUs:



OK, I just noticed something! The i5-4690K is not, I repeat NOT, a Z87-era CPU... For a second or a little more, I froze! I thought the ASRock Z87 OC Formula mobo might not be compatible. The i5-4690K is a "Devil's Canyon" CPU. But after reading a little, I learned that it should work just fine! That was a close one, too close! Let's test it.









*Please help me with this, guys!*
1. Do you think it's best to test the system using the "old" (original) case fans, measure it, and then change the fans to compare the results? Or should I switch to the new case fans right away to see if it runs with 4 GT640 GPUs without overheating?
2. Should I remove as much as possible from the back of the case to get better airflow?

_(Please note this is the old image, but it shows the issue pretty well. I had to add fans and couldn't close the case.)_



.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic, No, no overclocking! Why? Because it's not so much datacrunching we gain by overclocking I guess. If I'm wrong, please correct me.
> As you can see, I'm using the i5 CPU instead of the i7. This because theres only 0.3 % to gain by choosing the i7 CPU! I showed the tests earlier on in this post. It would run twice as many jobs, but at half the speed. The reason I don't overclock it, is that I'm trying to build a system which do not run hot, doesn't get to close to the CPU heat shut-down limit. And by running Linux/Boinc, the CPU will be used 100% 24/7 without any breaks! I'm pretty sure theres not much to gain by "tuning" this system. I get why gamers overclock! Indeed I do, but in this case, I think it would only add to the pile of issues


The biggest argument against overclocking a 24/7 cruncher to me is the power consumption and heat. It is almost a linear performance gain. A 20% overclock would probably provide a 20% performance gain.


----------



## Tex1954

Quote:


> Originally Posted by *magic8192*
> 
> The biggest argument against overclocking a 24/7 cruncher to me is the power consumption and heat. It is almost a linear performance gain. A 20% overclock would probably provide a 20% performance gain.


I HAVE to jump in here and say with authority....

Magic is right but didn't tell it all...

1) It's *NOT TRUE* that HT doesn't benefit you much; it depends on the project's WU makeup. For instance, this new Universe project runs an average of 14 minutes per WU on a 2700K @ 4.5GHz. The SAME tasks running on a 4670K @ 4.2GHz take 11 minutes; call it 10 minutes if I could run it at 4.5GHz (waiting for a new heatsink to do that!). Also included is a comparison to a 1045T @ 3.25GHz:

2700K: 1440 / 14m * 8t = 822 Wu/Day
4670K: 1440 / 10m * 4t = 576 Wu/Day
1045T: 1440 / 21m * 6t = 411 Wu/Day

For Universe tasks, HT is a *huge* benefit.

2) Now, for OTHER projects like Edges, the tables turn and things are more equal:

2700K: 1440 / 10m * 8t = 1152 Wu/Day
4670K: 1440 / 5m * 4t = 1152 Wu/Day
1045T: 1440 / 9m * 6t = 960 Wu/Day

I could go on, but you see my point. Some coding mixes (compares, memory fetches, FPU operations, etc.) give benefit to HT or not... and also to AMD vs. Intel or not... SIMAP favors AMD in the way HT favors Universe! (Too bad SIMAP going away...)

Anyways, so far as BOINC goes, perhaps half (or more) the CPU projects tend to equalize HT or not... but there are a high percentage of exceptions...
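Tex's WU/day figures all come from one formula: minutes in a day, divided by minutes per WU, times concurrent tasks. A minimal shell sketch of that arithmetic (`wu_per_day` is just an illustrative helper, not anything from BOINC):

```shell
# WU/day = 1440 minutes per day / minutes per WU * concurrent tasks
# (floored, which matches the figures quoted above)
wu_per_day() {
    awk -v m="$1" -v t="$2" 'BEGIN { printf "%d\n", 1440 / m * t }'
}

wu_per_day 14 8   # 2700K, Universe, 8 threads -> 822
wu_per_day 10 4   # 4670K, Universe, 4 threads -> 576
wu_per_day 21 6   # 1045T, Universe, 6 threads -> 411
```

Plugging in your own per-WU runtimes is an easy way to compare HT on/off before committing a box to a project.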


----------



## DanHansenDK

Hí Tex and Magic,

Thanks for the facts! In this case, I think it's better to solve the heat issues first and then find another low-profile card that's faster. So far all I've been able to find is an ATI card, and the difference is only 5%. But gaining 5% per GPU (times 4) already exceeds what we'd gain overclocking the CPU. I totally get the urge to squeeze everything possible out of the CPUs! It's just that there's a reason for the CPU configuration: we need a system that will last. I've already succeeded in getting this system running with all 4 GPUs, but the temperature gets close to the maximum. That's why I'm trying to find a hardware configuration that decreases the heat inside the case, so we're not close to the limit all the time.
I've got a test bench, an AeroCool X case with the Asus Maximus Extreme board, which has a CPU overclocking device (a kind of remote control) so you can change CPU speed at the press of a button. So I do overclock on that bench, but when building a system that has to run at 100%, 24/7/365 for 5-10 years in a rack, I think it's better to stick to stock settings. In this case, anyway.

Am I talking total nonsense?

Tex and Magic, Please help me with this....
1. Do you think it's best to test the system using the "old" (original) case fans, measure it, and then change the fans to compare the results? Or should I switch to the new case fans right away to see if it runs with 4 GT640 GPUs without overheating?
2. Should I remove as much as possible from the back of the case to get better airflow?

(Please note this is the old image, but it shows the issue pretty well. I had to add fans and couldn't close the case.)



.


----------



## Tex1954

Now you know why data centers have such high A/C bills and use low power stuff...

In any case, The fans you use will make a difference I'm sure. I presume you want to close the case and let things go...

Well, I would do one step at a time, take measurements and go to next step. Also, where the air flows in the case is at least as important as how much...

Best to do all testing in final configuration IMHO...


----------



## magic8192

I would want a baseline with the old stuff, but it is really just a personal preference.
There is another thing you can do besides overclock the CPU. You can lower the voltage of the CPU to the minimum stable. You can also underclock the memory and even lower the voltage on the GPU. Some projects don't mind lower memory clocks on the GPU. This can reduce the power consumption and heat of the system with little or no drop in computing power, depending on the project.
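The intuition behind both points (overclocking gains are roughly linear in performance but worse than linear in power, and undervolting cuts power almost for free) is the usual dynamic-power rule of thumb, P proportional to f times V squared. A rough sketch with purely illustrative numbers (`rel_power` is a hypothetical helper, and real chips add static leakage on top of this):

```shell
# relative dynamic power ~= clock factor * (voltage factor)^2
rel_power() {
    awk -v f="$1" -v v="$2" 'BEGIN { printf "%.2f\n", f * v * v }'
}

rel_power 1.20 1.00   # +20% clock at stock volts -> 1.20x power
rel_power 1.20 1.10   # +20% clock, +10% vcore    -> 1.45x power
rel_power 1.00 0.90   # stock clock, -10% vcore   -> 0.81x power
```

So a stable undervolt buys heat and electricity savings with no throughput loss, which is exactly Magic's point.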


----------



## DanHansenDK

Hi Tex and Magic,

Quote:


> In any case, The fans you use will make a difference I'm sure. I presume you want to close the case and let things go...


Hi Tex. Yes, those CoolSpot fans were a test to see if the temperature could be lowered. The problem is, with little space and 1 CPU + 4 GPUs running, the heat will just increase and increase if it's not removed. A kind of vicious circle.
Quote:


> Well, I would do one step at a time, take measurements and go to next step. Also, where the air flows in the case is at least as important as how much... Best to do all testing in final configuration IMHO...


I agree! BTW what does IMHO mean? Something like in my opinion? Sorry







I'm not quite up to date with the "online talk"









Hi Magic,
Thanks for the info/ideas







When the heat and setup/configuration issues have been solved, maybe these things would be worth working on, to "tune" the system for maximum results and minimum power consumption.

*QUESTION:*
I've got 2 x 4GB of Kingston DDR3 memory, and I was wondering: isn't 1 x 4GB enough? It's Linux and BOINC only, and I had older systems running on 2GB. I'm asking because dual-channel memory only works with a matched pair in the memory slots. But then again, what do we actually gain by keeping dual channel?

*STATUS:*
I'm currently mounting parts onto the mobo. As we discussed, we'll test the system using the old fans first, so we can see the difference. It's the 4th graphics card that's the problem! This is why test system 1 had 1 graphics card, test system 2 had 2 graphics cards, etc.
Well, let's go









*TEST:*
"Headless Linux Multiple GPU Boinc Server". Test system 4. A new brand/mobo Asus Z87 OC Formula, Intel i5-4690K*, 4 GPU's (Asus GT640 Low Profile)



* _OK, I just noticed something! The i5-4690K is not, I repeat NOT, a Z87-era CPU... For a second or a little more, I froze! I thought the ASRock Z87 OC Formula mobo might not be compatible. The i5-4690K is a "Devil's Canyon" CPU. But after reading a little, I learned that it should work just fine! That was a close one, too close! Let's test it._

I may have to update the BIOS to make this system work. When I chose the CPU, I made a mistake and picked the wrong CPU generation. I thought it was right: the i5-4670 and i5-4670K are Haswell, but the i5-4690K isn't. With BIOS version 2.10 or later, though, the "Devil's Canyon" CPUs are supported.
















_BIOS 2.10 - 7/22/2014 - Instant Flash - 4.69MB
1. Update InstantFlash Module.
2. Update Intel(R) ME.
3. Support i7-4790K, i5-4690K and Intel® Pentium® G3258 EZ OC.
4. Adjust "Intel(R) Smart Connect Technology" setting._


----------



## Tex1954

Wow! *In My Humble Opinion*, you are working too hard and too late... definitely got the bug I think....

LOL!

You asked a lot of questions... well, you surely don't want to experience thermal runaway and about the only cure for that would be a case mod or change. Fact is, a 2U or 3U case may work better for you.

Most projects take little memory, others take 500Meg or more (NFS) up to 10Gig or more (Lattice) per WU. There would be a few projects you simply could not run and you would learn fast what those were when the virtual HD started thrashing. IF it was up to me, 2Gig per core would be nice... but likely most projects would do fine with 1Gig per core...


----------



## DanHansenDK

Hi Friends,

And then everything went black!!

Never in my life have I seen a worse manual!! One line in the manual tells the user that he/she has to connect this 4-pin HD power connector directly to the motherboard. I have never seen this in all the years I have been building computers. So, of course, one line on page 4865 is more than enough. S..... M........... !!!

I'm really, really sad right now! Thousands of dollars, multiple days spent, and just because of where an engineer placed a connector, I can't go on with the test. The problem is, the connector is horizontal and there's no room for a horizontal connector: I've only got about 11-12 mm from the connector on the mobo to the edge of the 2U industrial PSU!

Here's an image of the connector. I learned by searching the web that this connector has to be used if you install a graphics card. Well, of course we want to install graphics cards; why would we buy a mobo like this, with 4-way SLI, if we didn't? All the manual says is, _"connect a hd connector to this if you install 2 graphic cards on this motherboard"_! Well, do they mean 2 graphics cards in total? More than 1 card? Is the onboard graphics included in _"..2 graphic cards..."_?



Anybody got an idea? If so, please don't hesitate to let me know. I'm pretty down right now.


----------



## Tex1954

Well, the PCIe power specification is what they were worried about when they did that. It's just an aux connector to supply the mobo's PCIe power bus with additional juice for multiple cards. It has nothing to do with the onboard graphics...

Thing is, your cards are LOW power to begin with and MAYBE don't exceed the specs when 4 installed.

In any case, that is a ***** of a problem... If it were me, I would put a DVM on that connector, install 1 board at a time, and look for a substantial voltage drop to see if the bus is being overloaded...

Other than that, there ARE right-angled molex connectors that would take up little space after installation... Look for an adapter cable, maybe...


----------



## DanHansenDK

Hi Tex,

Thanks for your prompt response








Quote:


> Well, the PCIe power specification is what they were worried about when they did that. It's just an aux connector to supply the Mobo PCIe power buss with additional poop for multiple cards. It has nothing to do with the onboard graphics...


Well, then I'm totally lost! I was sure that was the reason the system will not boot or even get to the BIOS. Then I thought, well, maybe this is what they were talking about, and all I need to do is update the BIOS to version 2.10. But if that's the problem, how am I supposed to update the BIOS if I can't even access it? I tried everything: other memory, checking the power/HD-LED/reset header connections, another PSU, another keyboard, etc. But I know it's none of those things, because the onboard speaker would have told me so. Nothing happens. Only 1 graphics card installed, but the screen is black and it will not enter the BIOS. A Z87 / "Devil's Canyon" conflict?
If so, how do you update the BIOS without being able to access the BIOS or boot up?

http://www.asrock.com/MB/Intel/Z87%20OC%20Formula/index.asp?cat=Download&os=BIOS

BIOS v.2.10 2.10. 7/22/2014. Instant Flash. Support i7-4790K, i5-4690K and Intel® Pentium®

Quote:


> In any case, that is a ***** of a problem... If it was me, I would put a DVM on that connector and install 1 board at a time and look for a substantial voltage drop to see if the buss is being overloaded....


OK, that's a good idea. Then I can see whenever the voltage is not enough. But right now, we need it to access the BIOS.









Maybe I should try reseating the CPU?
http://www.overclock.net/t/1415886/problem-booting-with-asrock-z87-oc-formula
Do you think the problem is that the CPU is an i5-4690K, and that I need to update the BIOS? If so, shouldn't I still be able to access the BIOS even though the CPU isn't supported yet?

Thanks for the help


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Do you think the problem is the CPU is a i5-4690K ??? And that I need to update? If this is the case, wouldn't I be able to access BIOS even though the CPU wasn't yet supported??


You can also try pulling the motherboard out of the case and connecting the molex power just to see if that is the problem. If you hook up the molex power, pull all the add-on cards out of the system, and it still will not boot, then it is most likely the BIOS. Your options are pretty limited here. You can try to get ASRock to send you a replacement BIOS chip and cross-ship it, or you can possibly go to a local shop and get them to help you.


----------



## Tex1954

Hmmm, well, the next thing to do is pull the battery and short the contacts for 15 seconds to make sure the BIOS is reset to defaults, then try to boot again with NO graphics card installed, using onboard graphics only. BE SURE to hook up a monitor that supplies EDID info, not some old junk like I have, because I have discovered that a LOT of these new fancy BIOS setups need that EDID info.


----------



## DanHansenDK

Hi Guys,

Thanks for helping me here







I'll look into it right away! First time ever having these problems. My guess is a wrong or badly seated CPU; if not, then this power connector for extra graphics cards. Let's see.









Quote:


> ....then it is most likely the bios. Your options are pretty limited here. You can try to get ASRock to send you one and cross ship it, or you can possibly go to a local shop and get them to help you?


Hi Magic, a new BIOS? I can download the new BIOS, but how can I update the BIOS without being able to access it? I usually flash-update the BIOS. Thanks for the other ideas.








Quote:


> Hmmm, well, next thing to do is pull the battery and short out the contacts for 15 seconds to make sure the BIOS is reset to defaults and try to boot again with NO graphics card installed... using onboard graphics only.


Hi Tex, Check!! Will do!
Quote:


> BE SURE to hook up a monitor that supplies EDID info, not some old crap like I have because I have discovered that a LOT of those NEW linux base fancy BIOS setups need that EDID info.


I tested with another monitor and keyboard; that was one of the first things I did. I usually have a checklist for solving these problems, but with a black screen and no beeps (errors/warnings), I just didn't know what to do. Now I'll try what you both suggested.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Guys,
> 
> Hi Magic, a new BIOS? But the new BIOS I can DL. But how can I update the BIOS without accessing the BIOS? I usually flash update BIOS. Thanks for the other ideas


That board has 2 BIOS chips, and they are removable. You can remove one of the BIOS chips and either get a local shop to flash it or call ASRock.

I would call ASRock to see if the BIOS is the problem. The BIOS version that shipped with the board is probably on a sticker on top of the chip. ASRock will be able to tell you if your CPU is compatible with that BIOS and give you options. I don't know how they handle it today, but they might just send you a BIOS chip with the correct BIOS for your CPU.


----------



## DanHansenDK

Hi Guys,

SUCCESS!!! Well, I tried the things we agreed on and found the problem. It was the CPU! I took the CPU from another test bench, an i5-4670K, which is "Haswell" architecture, and then it worked just fine, with an additional graphics card and without any 12V HD power connector.









*"ASRock Z87 OC Formula" and i5-4690K CPU*
http://www.asrock.com/mb/Intel/Z87%20OC%20Formulaac/?cat=Download&os=BIOS
IMPORTANT! If you buy this motherboard, the "ASRock Z87 OC Formula", together with an i5-4690K CPU, remember that you'll also need an older supported (Haswell) CPU, so you can update the BIOS before the i5-4690K will work.








I know it was my error buying this CPU. But ASRock, friends, please tell buyers that if we want to use the i5-4690K or i7-4790K CPUs with this mobo, we need a CPU that already works before an update is possible!









I'll test the new CPU in my test bench, and if it works with the mobo ASUS Maximus VI Extreme, then I will just keep it there









Let's continue the test. And thanks for the help Magic and Tex


----------



## DanHansenDK

Hi Magic,
Quote:


> That board has 2 bios chips and they are removable. You can remove one of the bios chips and either get a local shop to flash it or call Asrock.


OK, this I didn't know







That's indeed a way to solve the problem. I didn't know it was possible to remove the BIOS chip from the board... without a hacksaw, that is.







I'll look at the chips to see if this is possible. Thanks a lot for letting me know.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic,
> OK, this I didn't know
> 
> 
> 
> 
> 
> 
> 
> That's indeed a way to solve the problem. I didn't know that it was possible to remove the chipset from the board. Without a hacksaw that is
> 
> 
> 
> 
> 
> 
> 
> I'll look a the chips to identify this possibility and thanks a lot for letting me know


Glad you got it working


----------



## DanHansenDK

Hi Guys,

Thanks Tex, me too.. It always helps to hear from others! Especially others who know what they are talking about









Hi Tex,
Tex, I missed this post! Why I don't know:
Quote:


> You asked a lot of questions... well, you surely don't want to experience thermal runaway and about the only cure for that would be a case mod or change. Fact is, a 2U or 3U case may work better for you.


It is 2U cases! I chose 2U because the CPU fans needed the space and because it's impossible to get more than 2 graphics cards into a 1U case. 1U is too small, as you also said; it's just not right for the project. But there's no need for 3U or 4U cases, I think, unless you want to use faster, bigger graphics cards. As you know, I'm struggling to find a card faster than the GT640 that can endure the same heat as this Asus card.











I just rebuilt the case! As you can see in the picture, the PSU is placed at the front of the case, so that an ordinary PSU can be used. I don't like that! It's totally stupid to place the PSU there: all its heat gets routed right into the centre of the case, where we are working to keep the temperature down! Not a very smart solution. Well, smart for those who just need a 2U case and don't have CPUs/GPUs working at 100% all the time producing a ton of heat.








This is how I place the PSU, HD, etc. The HD sits between the 2 front openings and the spot where the PSU was supposed to be:



Quote:


> Most projects take little memory, others take 500Meg or more (NFS) up to 10Gig or more (Lattice) per WU. There would be a few projects you simply could not run and you would learn fast what those were when the virtual HD started thrashing. IF it was up to me, 2Gig per core would be nice... but likely most projects would do fine with 1Gig per core...


Please explain this to me, Tex.







Because I always thought 2GB of memory was more than enough; one of the gurus from SETI talked about this some time ago. So if I'm wrong, please let me know. I'm using 4GB for the test systems, but the new test system has 8GB (it was a better deal to buy two modules at a time instead of one 4GB module). So please explain why it's better to use 8GB or more instead of 4GB, because I was about to remove one 4GB module to use in the next test system, test system 4, where I planned on water cooling.


----------



## DanHansenDK

Hello friends









20:33 --> Updating BIOS. Very pleased that I read the manual: to update the Asus mobo there were some pretty odd things I had to do. Download the update, extract it, and rename the image file (.CAP) to M6E.CAP.








After this I'll update the ASRock board. It's not strictly needed since I changed CPU, but it avoids other issues. Why not do it every time you get a new mobo? I didn't expect the shipped BIOS to be so much older. Well, let's go, and then on with the test.

23:39 --> BIOS updated on the Asus mobo (my test bench) that I "borrowed" the Haswell CPU from, and it now runs with the "Devil's Canyon" CPU (i5-4690K). Remember I wished for a way to update the ASRock mobo without needing a supported CPU? Of course the Asus Maximus VI Extreme has such a function! I just needed to copy the BIOS image file to a FAT32 USB key, rename it to M6E.CAP, plug it into the USB connector on the back of the mobo, and press a special "update" button. Lights flashed, and after about 60 seconds the BIOS was updated! Pretty fancy stuff. I guess you get some extra tools to play with when paying more than double.








OK, testing time. I've installed all 4 graphics cards, and we are ready to install the "Headless Linux Multiple GPU Boinc Server". I'll use the Nvidia/grep command to check the GPU temperatures; it's more accurate. Then, once the system is running without card 4 getting too hot, I'll install the AeroCool fan/temperature control. Remember, the reason for this piece of hardware is to avoid the issues that come with the mobo's automated fan control: regardless of which setup you choose, the fans are always ramping up and down, never settling. That behaviour saves energy, which is fine, but not when you are running 1 CPU and 4 GPUs at 100%, 24/7/365! With the AeroCool we control it ourselves, and we get a visual indicator as well. My shell scripts CPUWatchDog, GPUWatchDog, HDWatchDog and FanWatchDog will monitor all of it and shut the system down if anything gets too hot or breaks down. But it's nice to be able to see the numbers (RPM, temperatures, etc.), and when something goes wrong, e.g. a fan stops, the AeroCool will sound an alarm. More about this later.
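Dan's GPUWatchDog script isn't posted in the thread, so the following is only a guessed-at sketch of the shape such a watchdog could take. The 85 degree threshold and the `nvidia-smi` query are assumptions, not his actual script:

```shell
#!/bin/sh
# Sketch of a GPU temperature watchdog (assumed threshold, not Dan's script).
LIMIT=85   # shutdown threshold in degrees C (assumption)

too_hot() {
    # exit 0 (true) if any line of the input exceeds LIMIT
    echo "$1" | awk -v lim="$LIMIT" '$1 + 0 > lim { hot = 1 } END { exit !hot }'
}

# On the server, the readings would come from the driver, one line per GPU, e.g.:
#   temps=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader)
temps="62
71
68
86"

if too_hot "$temps"; then
    echo "GPU over $LIMIT C - a real watchdog would now run: shutdown -h now"
fi
```

Run from cron every minute or two, the real script would log the readings and call shutdown instead of just echoing.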









First we'll make the ToDo, so that you can build a "Headless Linux Multiple GPU Boinc Server" as well.
* Because of the issues that appeared with the Ubuntu 12.04.4 update, where the CUDA 5.5 Debian packages wouldn't work, this ToDo is still based on 12.10. I've got a few things I'd like solved, so if you have a solution, please don't hesitate to write.








Problem: static IP. I'll show how it's done in 12.04, where it works perfectly, as it always has. But in 12.10 it doesn't. Please don't just Google it and post the result; we have already tried several things, so it's not that easy a fix.









Code:


# Reconfigure the network to static IP - Ubuntu Server 12.04:

Command: #1 vi /etc/network/interfaces
File:

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
        address 192.168.1.xxx
        netmask 255.255.255.0
        network 192.168.1.0
        broadcast 192.168.1.255
        gateway 192.168.1.1
        dns-nameservers 8.8.8.8 8.8.4.4

#2 
Command: # /etc/init.d/networking restart

#3 
Command: # vi /etc/hosts
File:

127.0.0.1       localhost.localdomain   localhost
192.168.1.xxx   xxxxxx.domain.tld               xxxxxx

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

But it's different in 12.10. We'll only need 12.10 until they fix the bug in the 12.04.4 update, but until then it would be nice to be able to use static IPs.
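One thing that bites people on headless boxes regardless of Ubuntu version: restarting networking with a typo in /etc/network/interfaces drops your SSH session for good. A tiny pre-flight check along these lines can help (`has_static_stanza` is just an illustrative helper, not an existing tool):

```shell
# Confirm the interfaces file really declares a static stanza for the
# interface before restarting networking on a headless box.
has_static_stanza() {
    # $1 = path to interfaces file, $2 = interface name (e.g. eth0)
    grep -q "^iface $2 inet static" "$1"
}

# Usage on the server (12.04 paths as in the config above):
#   has_static_stanza /etc/network/interfaces eth0 && /etc/init.d/networking restart
```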








You can read about the "Bug" here ( look for *** ): https://developer.nvidia.com/cuda-toolkit-55-archive


----------



## Tex1954

Well, aside from the benefit of having dual channel memory operation with two modules, 2 Gig for each core will allow you to run NFS big tasks that are one of the best CPU points projects out there. If you are not concerned about NFS or Lattice or other large memory footprint projects, feel free to run the minimum of 1 Gig per core or less. Some projects take very little to be sure.
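Tex's sizing rule is simple enough to compute for the build in question (4 cores, no HT on the i5); `ram_gb` below is just an illustrative helper:

```shell
# RAM to buy = cores * GB-per-core (Tex's rule of thumb)
ram_gb() {
    echo $(( $1 * $2 ))
}

ram_gb 4 1   # comfortable minimum for a 4-core i5 -> 4 (GB)
ram_gb 4 2   # Tex's "nice to have"                -> 8 (GB)
```

Which is why Dan's 2 x 4GB kit is a sensible fit even for the big-footprint projects.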

Looks awesome BTW...

You are doing a great job!


----------



## DanHansenDK

Here's the ToDo







This only took me about 3 months to solve. Well, it's not a thing to be proud of, I know, working 3 months on the problem of building a "Rack-mounted Headless CLI Ubuntu Linux Multiple GPU Boinc Server". Some solve these things in a day, I guess, but I've only been "doing" Linux for about a year and a half, so for me this was a huge step.







I got some ideas from a few guys at SETI, a few good ideas from a guy in the Netherlands, and then some help from you guys in here.







Indeed I did, but I mainly solved it because I tried again, and again, and again, and again....

The idea is to build a system that is semi-professional and not too pricey, so that an ordinary guy like me can build one and add a new "Boinc Server" to the rack every now and then. That way everybody can participate and support one or more of the great BOINC projects out there. This is my way of helping to increase the number of members and computers doing BOINC.

05:42 am --> The ToDo (only short version, showing how to install and set up a headless cli linux multiple gpu boinc cruncher)

07:51 am --> OK, I have been trying to make a temporary ToDo with some help regarding the Linux server installation, but there are just too many unanswered issues, so here's the raw ToDo, "How To Make A Headless CLI Ubuntu Linux Multiple GPU Boinc Server". I will make a complete ToDo showing everything later, but here's how to make it work if you already have Ubuntu Server (12.10) running.

Code:


HEADLESS LINUX CLI MULTIPLE GPU BOINC SERVER - RACK-MOUNTED HEADLESS BOINC SUPER CRUNCHER
OS: UBUNTU SERVER 12.04 64Bit
GPU'S: NVIDIA
VER.1.1.2 13.08.14 00:27:00

IMPORTANT! BECAUSE OF THE UBUNTU 12.04.4 UPDATE "BUG" THE CUDA5.5 DEBIAN PACKAGE WILL NOT WORK. THEREFORE 12.10 IS USED FOR NOW!
Read about it here ( look for *** ) [URL=https://developer.nvidia.com/cuda-toolkit-55-archive]https://developer.nvidia.com/cuda-toolkit-55-archive[/URL]

- lm-sensors - needed to run my shell scripts CPUTempWatchdog.sh, GPUTempWatchdog.sh, HDDTempWatchdog, FANcontrolWatchdog.sh
- Vim-nox - for easier VI and enhanced monitor look
- Midnight Commander - to manage files in the old-fashion way
- NTP Time Server Update - to keep server time updated
- bash as default shell - not sure about this, just yet.

- Boinc-client from ppa:costamagnagianfranco/boinc
- BoincTasks to remotely view and check jobs for all servers
- AndroBoinc to remotely view and check jobs from your Android
- Setup Boinc-client for remote control/SSH
- Setup Boinc-client for using multiple GPU's --> "cc_config.xml"
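
For the remote control item above, on the Debian/Ubuntu boinc-client packaging this usually comes down to two small files. A sketch - the password and address below are placeholders, not values from this build:

```
# /etc/boinc-client/gui_rpc_auth.cfg - one line: the RPC password BoincTasks uses
MySecretPassword

# /etc/boinc-client/remote_hosts.cfg - one allowed host (name or IP) per line
192.168.1.10
```

Restart the client after editing so the files are re-read.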

TODO'S - THESE ISSUES NEED TO BE DONE/SOLVED
- CRON set to run shell scripts
- CRON for other jobs?? Backup??
- GRUB boot - change startup config --> no halt on error! boot no matter what's wrong!
- Static IP for 12.10 !!!
- Setup Boinc-client for special applications --> "app_info.xml" Set application-specific instructions!?
- Setup Boinc-client for special applications --> "global_prefs_override.xml" Set status "work" from the beginning!?

Install Ubuntu Server 12.10! And then:

#1 
Command: # apt-get install build-essential linux-headers-`uname -r`

#2 
Command: # wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1210/x86_64/cuda-repo-ubuntu1210_5.5-0_amd64.deb

#3
Command: # dpkg -i cuda-repo-ubuntu1210_5.5-0_amd64.deb

#4
Command: # apt-get update

#5
Command: # apt-get install cuda-5-5

#6
Command: # export CUDA_HOME=/usr/local/cuda-5.5

#7
Command: # export LD_LIBRARY_PATH=${CUDA_HOME}/lib64

#8
Command: # PATH=${CUDA_HOME}/bin:${PATH}

#9
Command: # export PATH
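
Note that the exports in #6-#9 only live in the current shell and are lost at the reboot in #10. One way to make them permanent (a sketch, assuming the default CUDA 5.5 install path) is to drop them into a profile script such as /etc/profile.d/cuda.sh:

```shell
# Contents of e.g. /etc/profile.d/cuda.sh - sourced by login shells,
# so CUDA_HOME, LD_LIBRARY_PATH and PATH survive reboots
export CUDA_HOME=/usr/local/cuda-5.5
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=${CUDA_HOME}/bin:${PATH}
```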

#10
Command: # reboot

#11
Command: # apt-get install linux-image-extra-$(uname -r) x11-xserver-utils mesa-utils

#12 
Command: # modprobe nvidia

#13
Command: # nvidia-xconfig --enable-all-gpus

#14
Command: # cp /etc/X11/XF86Config /etc/X11/xorg.conf

#15 Setup Boinc to use all GPU or selected GPU's:
Command: # vi /etc/boinc-client/cc_config.xml
<cc_config>
        <log_flags>

                <task>1</task>
                <sched_ops>1</sched_ops>
                <file_xfer>1</file_xfer>
                <app_msg_receive>1</app_msg_receive>
                <app_msg_send>1</app_msg_send>

                <cpu_sched_status>1</cpu_sched_status>
                <cpu_sched>1</cpu_sched>

                <gui_rpc_debug>0</gui_rpc_debug>
                <slot_debug>0</slot_debug>
                <std_debug>0</std_debug>
                <task_debug>0</task_debug>

        </log_flags>
        <options>

                <start_delay>10</start_delay>
                <use_all_gpus>1</use_all_gpus>
                <allow_remote_gui_rpc>0</allow_remote_gui_rpc> 
                <allow_multiple_clients>0</allow_multiple_clients>

        </options>
</cc_config>

#16 The file cc_config.xml is read when the client starts, so reboot for it to take effect:
Command: # reboot

#17 Install boinc-client and attach to projects using the boinccmd command-line tool ;)
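
The attach step can be done with boinccmd, the client's command-line tool. A sketch - the project URL and account key are placeholders, and the `run` helper is a hypothetical wrapper that echoes each command and skips execution when boinccmd isn't installed yet, so the lines are safe to paste anywhere:

```shell
# Hypothetical dry-run helper: print the command, only execute it
# if boinccmd is actually installed, and tolerate errors while testing
run() {
    echo "+ $*"
    if command -v boinccmd >/dev/null; then
        "$@" || true
    fi
}

# Placeholders: substitute your project's URL and your own account key
run boinccmd --project_attach http://www.example-project.org/ YOUR_ACCOUNT_KEY
run boinccmd --get_state          # confirm the client sees the GPUs
```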

THIS TODO WILL BE UPDATED AND WILL IN THE END SHOW YOU HOW TO DO IT ALL.
I WILL POST AN UPDATED VERSION LATER TODAY


----------



## DanHansenDK

12:15 pm

Hi Guys!! OS installed. I'm about to start at the top of the list and build the "Headless CLI Ubuntu Linux Multiple GPU Boinc Server". But I just noticed that the Ubuntu 12.04.4 Update/CUDA5.5 .deb "Bug" seems to have been solved in the new CUDA version. CUDA6.0/Ubuntu 12.04 doesn't appear to have issues







Well, if this is true I will switch to 12.04 Server again. But, for now we will go on with the test, sticking to what we know will work! The task we have right now, is to test the heat when using 4 Asus GeForce GT640 Low Profile graphic cards, using ordinary case fans and the special fans that I got from my supplier









Here's a couple of links, in case you want to see/read about the "Bug" that I'm talking about:
CUDA5.5/Ubuntu 12.04.4 issues (look for ***): https://developer.nvidia.com/cuda-toolkit-55-archive
CUDA6.0/Ubuntu 12.04. No remarks or warnings: https://developer.nvidia.com/cuda-downloads

Quote:


> Well, aside from the benefit of having dual channel memory operation with two modules, 2 Gig for each core will allow you to run NFS big tasks that are one of the best CPU points projects out there. If you are not concerned about NFS or Lattice or other large memory footprint projects, feel free to run the minimum of 1 Gig per core or less. Some projects take very little to be sure.


OK, I see. I checked the BIOS, and it says "Dual Channel Memory" is active. Regarding NFS and Lattice, I'm not sure what these are. Sorry. I'll look it up









Let's go on with the installation. I'm doing a version 1.1.3 of the ToDo at the same time


----------



## DanHansenDK

17:22 pm --> Even more issues









12.10 is not supported anymore and can therefore not be used for this. I spent weeks changing from 12.04 --> 12.10 because of the update "bug" mentioned earlier on, and now it's "out of date"











Well, that's life. I'll go right ahead with making a new ToDo using 12.04. It should be possible using CUDA6.0. I haven't found any warnings or "bugs" yet. I just wanted to get going with the test, but there's no way around it I guess








Actually this "problem" solves another problem







The issue regarding static IP in 12.10 will be solved when returning to 12.04 because the configuration works perfectly in 12.04.
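
For reference, the static IP setup that "just works" on 12.04 is the classic /etc/network/interfaces style. A sketch - the addresses below are placeholders for whatever the rack's network uses:

```
# /etc/network/interfaces - static address for the first NIC
auto eth0
iface eth0 inet static
    address   192.168.1.50
    netmask   255.255.255.0
    gateway   192.168.1.1
    dns-nameservers 192.168.1.1
```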











So, let's go.
(had to rest a little, but now we are ready to go on)









*Making new ToDo - using Ubuntu Server 12.04.5 and CUDA 6.0*
19:09 pm

Code:


HEADLESS LINUX CLI MULTIPLE GPU BOINC SERVER - RACK-MOUNTED HEADLESS BOINC SUPER CRUNCHER
OS: UBUNTU SERVER 12.04.5 64Bit
CUDA: CUDA 6.0
VER.1.1.4 13.08.14 17:48:00

- LM-sensors - needed to run my shell scripts CPUTempWatchdog.sh, GPUTempWatchdog.sh, HDDTempWatchdog, FANcontrolWatchdog.sh
- Vim-nox - for an enhanced vi and a better console look
- Midnight Commander - to manage files the old-fashioned way
- NTP time server updates - to keep the server time updated
- bash as default shell - not sure about this just yet

- Boinc-client from ppa:costamagnagianfranco/boinc
- BoincTasks to remotely view and check jobs for all servers
- AndroBoinc to remotely view and check jobs from your Android
- Setup Boinc-client for remote control/SSH
- Setup Boinc-client for using multiple GPU's --> "cc_config.xml"

TODO'S - THESE ISSUES NEED TO BE DONE/SOLVED
- CRON set to run shell scripts
- CRON for other jobs?? Backup??
- GRUB boot - change startup config --> no halt on error! boot no matter what's wrong!
- Setup Boinc-client for special applications --> "app_info.xml" Set application-specific instructions!?
- Setup Boinc-client for special applications --> "global_prefs_override.xml" Set status "work" from the beginning!?

NOT FINISHED YET - WORKING ON THIS PAGE

23:31 pm --> Hello friends







I can't promise anything yet, but after making a new ToDo and setting up the system using Ubuntu Server 12.04 & CUDA 6.0, I just tested one thing. I wanted to know how many graphic cards are recognized after installing the CUDA ToolKit







It looks good, but let's not get excited just yet









Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 33 C
        Gpu                         : N/A
        Gpu                         : 33 C
        Gpu                         : N/A
        Gpu                         : 34 C
        Gpu                         : N/A
        Gpu                         : 33 C
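
As a side note for the watchdog scripts: the N/A lines make this output awkward to use directly, but the numeric temperatures are easy to pull out. A small sketch, run here against a canned copy of the output above so it works without a GPU present:

```shell
# Extract just the numeric temperatures from `nvidia-smi -a` style output.
# 'sample' is a canned copy of the lines shown above; for real use,
# pipe `nvidia-smi -a | grep Gpu` in instead.
sample='        Gpu                         : N/A
        Gpu                         : 33 C
        Gpu                         : N/A
        Gpu                         : 34 C'

# Keep only lines ending in "C" and print the number just before it
temps=$(printf '%s\n' "$sample" | awk '$1 == "Gpu" && $NF == "C" { print $(NF-1) }')
echo "$temps"    # one temperature per line: 33 and 34
```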

*01:53 am --> Houston, we've got a problem!*
As you can see above, all GPU's are seen by the mobo. Boinc is running on the system, but the GPU's are not picked up by Boinc! I got a couple of error messages when doing this step - step #11 in the ToDo from earlier on:

Code:


# apt-get install linux-image-extra-$(uname -r) x11-xserver-utils mesa-utils

Reading package lists... Done
Building dependency tree
Reading state information... Done

E: Unable to locate package linux-image-extra-3.13.0-32-generic
E: Couldn't find any package by regex 'linux-image-extra-3.13.0-32-generic'

Please help me solve this issue







I think it's the only problem that needs to be solved before the test can go on. I've spent several hours editing/redoing the ToDo, 12.04.4 --> 12.10 and then back to 12.04 with the new 12.04.5 update. So if anyone has got an idea how to fix the problem, please don't hesitate to share it
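
When apt can't find linux-image-extra-$(uname -r), the first thing worth checking is whether the archive carries an -extra package for the kernel that is actually running. A small sketch (the apt-cache and dpkg calls are guarded so the lines paste safely anywhere):

```shell
# Which kernel is running? The -extra package name must match this exactly.
uname -r

# What does the archive actually offer? (guarded: only runs where apt exists)
command -v apt-cache >/dev/null && apt-cache search linux-image-extra || true

# Which kernel images are already installed?
command -v dpkg >/dev/null && dpkg -l 'linux-image*' | grep '^ii' || true
```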









*11:08 am --> Status:*
It looks like the new update 12.04.5 and CUDA6.0 have got issues as well. I've asked around at Ubuntu Forums, but it's not that easy when having problems with a totally new update (12.04.5). Only 7 days old! Ohh, I was just dreaming of being able to use 12.04, which I know so well, for this project. It's just too bad there are all these issues.
I'm thinking, if nobody has got a solution to the problem, that I'll go on trying the combination of 13.10 and CUDA5.5 or 13.10 and CUDA6.0. Maybe there aren't too many changes from 12.10 --> 13.10. Then maybe I can get this command to work:

apt-get install linux-image-extra-$(uname -r) x11-xserver-utils mesa-utils

I'm still talking to a few guys, so maybe we don't have to give in just yet. It's too bad, I never saw this coming. I totally forgot about the fact that 12.10 had a short support life and that I would have to fight the same problems again. Not giving up just yet


----------



## DanHansenDK

*20:29pm --> STATUS*

OK, so far we've found no solutions to the issues regarding:

Code:


# apt-get install linux-image-extra-$(uname -r) x11-xserver-utils mesa-utils

Reading package lists... Done
Building dependency tree
Reading state information... Done

E: Unable to locate package linux-image-extra-3.13.0-32-generic
E: Couldn't find any package by regex 'linux-image-extra-3.13.0-32-generic'

This is why I'll now try to install Ubuntu Server 13.10 (12.10 is not supported any more) and CUDA 5.5 to see if this will work. If the difference between 12.10 and 13.10 is not too big, there's a chance it will work. Actually I'm getting a little tired of these issues.
I'm getting to the point where another distro sounds like a good idea... Or maybe even Windows







YES Of course!! Let's spend another US$ 200,- on each Boinc Server so that we can run one piece of software









*STATUS*

14.08.14 10:06:00pm
Problem! 12.10 not supported any more.
Try 12.04.5/CUDA6.0

13.08.14 11:31:00pm
12.04.5/CUDA6.0 Tested. NOT WORKING!!! Installation issues!
Try 13.10/CUDA5.5

14.08.14 10:06:00pm
13.10/CUDA5.5 Tested. NOT WORKING!!! Installation issues!
Try 13.04/CUDA6.0

15.08.14 01:41:00am
13.04/CUDA6.0 Tested. NOT WORKING!!! A LOT of installation issues!
Try 14.04.5/CUDA6.0

15.08.14 04:03:00am
Try 14.04.5/CUDA6.0. Solved the problem, but not in a very fancy way









Try 12.04.3/CUDA5.5 (Before CUDA/Ubuntu 12.04.4 Bug!)

*STATUS:*

15.08.14 22:05:00 --> OK, I cracked it, but my G.. !! I'm not sure about the stability of this configuration, because I really had to "cut some corners". Anyway, here's the proof:

Time & date: 15 Aug 2014, 20:01:58
Boinc version: 7.2.42
CPU: GenuineIntel Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz [Family 6 Model 60 Stepping 3](4 processors)
GPU: [4] NVIDIA GeForce GT 640 (1023MB) OpenCL: 1.1
OS: Linux 3.13.0-34-generic

I think I'll test one other thing I've been thinking about, and if that doesn't work, I'll install a Desktop version instead and go on with the test. Several days have passed fighting this, and that was not the plan. So I'll try this other thing and then go on with the test. Then we can get back to the issue of making a stable version of a "Headless Linux CLI Multiple GPU Boinc Server"









*STATUS:*

18.08.14 06:41pm --> OK then... I had to wait for Asteroid's system to be up and running, so I couldn't test my new solution - 14.04 combined with CUDA6.0. I was pretty sure it would work, but I couldn't know for sure.. So I waited!! Today Asteroid was up and running again, so I was able to test it. It didn't work, and that was no surprise to me, because there were a lot of shortcuts along the way.
So I made the next and second-to-last attempt according to the "plan", which was to try the 12.04.3 version, which should work along with CUDA5.5 (according to Nvidia's own "CUDA DL Zone Support").
I tried it, and there were only a few problems during the installation. So I tried to run Boinc without correcting these issues, and it almost works as it's supposed to. What we need now is for the 3 "extra" GPU's to get into the game







But this can be due to a lot of things, or at least that's what I think.
This is why I'm writing again. Maybe one of you has got an idea. Is it perhaps a motherboard issue (there were plenty of those when installing the mobo in the beginning), or is it maybe due to the issues during the installation? Or maybe it's a whole other reason which I haven't been thinking about!?!?

Please let me hear a few of your ideas









During setup, when following my ToDo version 1.1.5 and installing CUDA5.5 I ran into these issues when using this command:

Code:


apt-get install linux-image-extra-$(uname -r) x11-xserver-utils mesa-utils

Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package linux-image-extra-3.8.0-29-generic
E: Couldn't find any package by regex 'linux-image-extra-3.8.0-29-generic'

Or this one. There are only these two issues in total! Not many compared to the s... I've been dealing with the last 6 days or so









Code:


# nvidia-xconfig --enable-all-gpus

WARNING: Unable to locate/open X configuration file.

WARNING: Unable to parse X.Org version string.

Here's how it looks








4 jobs being chewed by the CPU and 1 being crunched by 1 out of 4 GPU's







We are almost there guys









Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 51 C  <-------- THIS ONE IS RUNNING AT FULL SPEED ;)
        Gpu                         : N/A
        Gpu                         : 30 C  <-------- zZzzZz zZZzZ zzZz
        Gpu                         : N/A
        Gpu                         : 31 C  <-------- zZzzZz zZZzZ zzZz
        Gpu                         : N/A
        Gpu                         : 30 C  <-------- zZzzZz zZZzZ zzZz

# sensors
coretemp-isa-0000
Physical id 0:  +66.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +66.0°C  (high = +80.0°C, crit = +100.0°C)  <-------- THIS ONE IS RUNNING AT FULL SPEED ;)
Core 1:         +64.0°C  (high = +80.0°C, crit = +100.0°C)  <-------- THIS ONE IS RUNNING AT FULL SPEED ;)
Core 2:         +63.0°C  (high = +80.0°C, crit = +100.0°C)  <-------- THIS ONE IS RUNNING AT FULL SPEED ;)
Core 3:         +59.0°C  (high = +80.0°C, crit = +100.0°C)  <-------- THIS ONE IS RUNNING AT FULL SPEED ;)

fan1:            0 RPM  (min =    0 RPM)  ALARM
fan2:         8490 RPM  (min =    0 RPM)  ALARM
fan3:            0 RPM  (min =    0 RPM)  ALARM
fan4:            0 RPM  (min =    0 RPM)  ALARM
fan5:            0 RPM  (min =    0 RPM)  ALARM


----------



## DanHansenDK

*STATUS:*
20.08.14 01:55am

Hello friends







We got it !!!! Solved the b..... problem









And now we are running Ubuntu 12.04 which means the "time of static IP trouble" is over !! Here's the proof:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 52 C
        Gpu                         : N/A
        Gpu                         : 56 C
        Gpu                         : N/A
        Gpu                         : 57 C
        Gpu                         : N/A
        Gpu                         : 54 C



OK, there's still a few issues I need to solve. This is an "elderly" version of 12.04. It's the 12.04.3 update to avoid the problem with CUDA5.5. But I'll fix it later on. Now it's on with the test.

I'm testing 1 CPU and 4 GPU's at full speed right now. In this test I'm using the "standard" fans. This is how it looks after about 10min. or so:

10 min. of testing at 100%:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 56 C
        Gpu                         : N/A
        Gpu                         : 57 C
        Gpu                         : N/A
        Gpu                         : 65 C
        Gpu                         : N/A
        Gpu                         : 56 C

20 min. of testing at 100%:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 67 C
        Gpu                         : N/A
        Gpu                         : 57 C

Notice that it's the card in socket d2 (third card in the row, and the second to last) that runs hottest. Actually I would have thought that the card next to the 2U PSU would have been the one with issues. There's not much room between that card's heatsink and the PSU, and therefore not much air. There's no fan pointing directly at it either! This is why tests are such a great thing









The AeroCool temp. sensor shows the same thing, but its temp. lies 5-6 degrees C below the system output. I guess I've chosen a bad spot to place the sensors. That's too bad, because I did a little extra, taking pictures of the installation/mounting of the sensors, using strips, insulators etc. Remember, this will all be part of the complete tutorial I'll do









30 min. of testing at 100%:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 66 C
        Gpu                         : N/A
        Gpu                         : 57 C

40 min. of testing at 100%:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 67 C
        Gpu                         : N/A
        Gpu                         : 57 C

50 min. of testing at 100%:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 57 C
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 66 C
        Gpu                         : N/A
        Gpu                         : 57 C

After 1 hour of testing CPU & 4 GPU's at 100% - this is the result, using the standard fans and no modifications to the case:

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +78.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +78.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +73.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +69.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +68.0°C  (high = +80.0°C, crit = +100.0°C)

# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 66 C
        Gpu                         : N/A
        Gpu                         : 57 C

I think we can say the temp. using the standard fans has been found
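
To take these readings at a fixed interval without retyping the two commands, something like this can run in a second SSH session (a sketch; `watch` ships with the procps package on Ubuntu):

```shell
# Refresh GPU and CPU temperatures on one screen every 60 seconds
watch -n 60 'nvidia-smi -a | grep Gpu; sensors | grep Core'
```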








The CPU temp. is a little high as well. I hope this will all go away when changing to the new type of fans.

Another thing. I just learned that there are 4 chassis/case fan connectors on this mobo. But again, the fans run according to CPU temp. If only the manufacturers would leave these very nice add-ons alone and stop making all kinds of funny "set-ups" to make it easy! You are making it difficult to use the board for more than one kind of operation. If these 4 extra fan connectors had worked independently, it would have been possible to use fan control from lm-sensors to make the most beautiful script to control and watch the fans. This is the reason I use AeroCool. Well, I've written about that several times, so let's not go down that road again, but it's just too bad...

On with the test. Let's change the fans and see the difference


----------



## Tex1954

Wow, a lot of work! Glad to see it's starting to pay off!


----------



## magic8192

Seems to be coming along nicely


----------



## DanHansenDK

Hello Tex,

Thanks my friend.. And yes, it was a little more than I first expected when starting the test. I forgot all about the short support period of Ubuntu 12.10. That's why it's great we got it to work using 12.04, even though it's an older version of it








There are a few "hiccups", but nothing really hard to solve. It just takes some time, which I'm not prepared to use right now.

OK, it's time to change the fans and see if the new fans, which runs twice as fast, will make a difference









Thanks for checking in on me Tex








*
STATUS:*
20.08.14 12:58am

Got a few pictures I would like to show you. Mounting/placing the sensors on the GPU's. A little harder than you might think to begin with








*
Image 1. Case/chassis heat issues*
Red: Here you can see the air flow from 3 of the 4 chassis/case fans. With the fans placed the way they are, I always guessed that GPU4/graphic card 4 (next to the 2U PSU, at the bottom of the image) would be the hot one. As we know now, it's not this card which gets hottest. It's card 3. The little red box in the lower right corner of the image is where the 4th fan is placed. It points directly at the intake/fan1 in the 2U PSU. That's good for the PSU, but not for the GPU's








Green: This is the area which I thought would cause the biggest issues, because of the limited space between the PSU and GPU 4's heatsink. I'm not quite sure why this isn't so. Is it because of the materials the heatsinks are made of? (The 4 graphic cards' heatsinks are not quite the same. The metal looks different on some of them. Maybe it's paint, I don't know.) Normally this would be a great reason for another test, but in this case we just press on with the test we are doing








Blue: Here you can see that I didn't mount the "low profile vga brackets". I only did that on the first card, because I use it during the first part of the installation. When the set-up/installation is done, it'll be unmounted as well. I leave the area open because of the hot air, to make it easier for the fans to blow the hot air out of the case.



*Image 2. Showing the "missing" low profile vga brackets & air-flow.*
Green & Red: As written above, the space between the 2U PSU and GPU 4/graphic card 4 was always my guess for causing the heat issues. To me it was a heat issue, because the temp. got very high, close to 70 degrees Celsius inside the case, and this increased the temp. of the other stuff inside. And it burned my fingers as well. This was too hot in my opinion







I would like this to be a stable running system which will be able to run for 10 years without any trouble or need for repairs.
Blue: I keep these open, so that the fans have a chance to blow out the very hot air. This case is very ventilated, but the air-flow is really wrong when using the rear side brackets etc. I like to be able to control the direction the air is moving. I don't want the hot air to come out in the front, when all fans, the CPU fan included, are pointing backwards











*Image 3. Where to place the sensors.*
Red: Here you can see the sensors we need to mount. In this case we mount them directly on the heatsink. It's very important that the sensor is placed in such a way that the air flow from the fans doesn't hit it directly. It's a bit difficult, because there's really not a lot of room here, but it's possible.



*Image 4. Mounting the sensor & securing it to the curcuit board.*
Red: Here you can see where to place the sensor. Use the tape which comes along with the AeroCool X-Vision, but use an extra piece of tape to keep it in place. I use a special kind of tape, which in Denmark is called "Fe-hud" (fairy skin). It's the kind you use when placing stuff on e.g. an article before you copy it on the xerox








Green: The plastic which protected the sensor, I use to protect the little sticker (showing sensor number) and to protect the wire against heat. It eases the mounting of the wire as well.



*Image 5 & 6. Using a strip* to mount and secure the sensor from falling off (* is it called a strip over there??)*
Red: Here you can see how I mounted the wire using a "*strip". This way it's kept in place and the sensor will not fall off when working on the case/chassis. Please notice the little hole in the top right corner. It's very nice of Asus to make such a little mounting-hole in the circuit board for us












*Image 7 & 8. Placing and securing sensor on CPU.*
Red: On image 7 you can see the whole CPU area, where the sensor is being placed. On image 8 you can see it a little better. I chose to place the sensor where I did, because it's the place closest to the CPU. It's a good place because it's not directly in the line of the air-flow from the fans. The sensor is secured the same way as the sensors on the GPU's, using a *strip and the plastic tube (the plastic tube is used to protect the sensor during installation). It's possible that we can find a better place for the sensor. The numbers are not exactly the same as the internal sensor's. Of course not, but it would be nice to be a little closer to "the truth". Or maybe not!? Maybe it's a good thing to know the difference between the inside of the circuits and the temp. right next to them. I would love to hear some remarks regarding this matter from you overclockers







I know you know a great deal about heat and temperatures







So please get back to me on this












Question!! What's the right word? Sensor is ...... placed/installed/mounted ????

_....more to come_

.


----------



## Tex1954

You know, a lot of folks that assemble systems merely assume things will be fine... you have really gone the extra mile! This is all very interesting to me too. I have a laser temperature gun I use to check things out and it works pretty well considering position limitations. Your temperature sensor placements seem ideal to me.

In any case, please keep us updated! Yours is an awesome setup!


----------



## DanHansenDK

Hi Tex,

Thanks my friend








Quote:


> I have a laser temperature gun I use to check things out and it works pretty well considering position limitations.


WHAT A F...... GREAT IDEA !!! I got such a "gun", but I've never thought about it that way!! Thanks Tex !! This is why I prefer overclock.net . No BS, only constructive ideas and suggestions !! Thanks my friend









*STATUS:*
20.08.14 6:08pm --> Just started to dismount the standard fans! The fan I'll test first is:
_The German 8 x 8 cm PAPST industrial fan
Type 8412 N/2GH.
12V. DC / 235mA / 2.8W_
I haven't got all the specs right now, but the RPM is more than twice as high. The depth of the fan is the same as the standard fan, so the result should be better, more or less







The "Tornado" fan is long overdue. Haven't got it yet even though I ordered it fro more than 3 weeks ago or so! Anyroad, I wanted to test this fan first. It's an industrial fan used in many machines in the industry and it's not so costy







Actually, I was able to make a deal with my supplier, which made them even cheaper. Well, they are not that cheap, but they are cheaper than the "Tornado" fan mentioned earlier on









BTW Tex, geographically, where in the world are you? Just curious







State & City ?? Texas maybe









On with it









*STATUS:*
21.08.14 00:49am --> The fan upgrade is pretty easy to do, because of the chassis/case design. (Remember the fan sequence - notice the small labels 1 - 5. Use 2-4. Fan 1 is the CPU fan connector.) Just unplug the fan connectors from the AeroCool and then unmount the combined "fan holder". Remove the fan guard/grill by unscrewing the 4 screws. After that you unscrew the fan, by unscrewing the 4 screws on the other side of the "fan holder". Now mount the 4 new fans by doing it all in reverse













Here's a little teaser while waiting for the new fans to be installed and tested. I'm trying to make a script which controls the fans on the graphic cards. It would be nice to succeed with "GPUFanWatchDog.sh"







This is what's on my mind when "turning in". Love to think on these things when trying to fall asleep







Usually I don't get much further this way, but I solve a lot of issues this way







Here's a "layout" of the script going tho be "GPUFanWatchDog.sh". The other *****WatchDog.sh scripts are done, more or less







I'm not satisfied, not just yet. I want to do it another way. As it is now, it's more or less like those 2-3 standard mobo fan programs. And I need to add variables so that the script checks all GPU's. If there are 2 GPU's it'll check those 2, if there are 4 GPU's it'll check those 4. So it's a script in the making. We are going to use the nvidia-smi command again:

Problems: Need to set Coolbits in xorg.conf to "4" to enable the fan control sliding bar under "Thermal Settings"

Code:


#!/bin/bash

# GPUFanWatchDog.sh v.0.1.0
# Checks the Nvidia GPU's temperature and adjusts fan speed accordingly
# CRON job
#

# Set update interval in seconds
interval=5

# Set min. and max. temperatures in degrees Celsius
min_threshold=45
max_threshold=60

# Set a number between 35 and 100 for fan speed. Speed is in percentage of maximum
min_speed=35
mid_speed=60
max_speed=100

# Continually loop
while true; do
    # Get current temperature and fan speed
    current_temp=$(nvidia-smi -q -d TEMPERATURE | grep Gpu | sed 's/.*\([0-9]\{2\}\).*/\1/')
    current_speed=$(nvidia-smi -q | grep Fan | sed 's/.* \(1\?[0-9]\{2\}\) .*/\1/')
    # Fan speed reads empty until Coolbits is enabled, so default to 0
    current_speed=${current_speed:-0}

    # Pick the target speed for the current temperature.
    # Note: use -gt for numeric comparison; '>' compares strings instead
    if [ "$current_temp" -gt "$max_threshold" ]; then
        target_speed=$max_speed     # above 60 C: full speed
    elif [ "$current_temp" -gt "$min_threshold" ]; then
        target_speed=$mid_speed     # above 45 C: medium speed
    else
        target_speed=$min_speed     # 45 C or below: minimum speed
    fi

    # Only set the speed if it actually needs changing,
    # as nvidia-settings eats CPU cycles
    if [ "$current_speed" -ne "$target_speed" ]; then
        nvidia-settings -a [gpu:0]/GPUFanControlState=1 \
                        -a [fan:0]/GPUCurrentFanSpeed="$target_speed" > /dev/null
    fi

    # Wait until interval expires before rechecking
    sleep "$interval"
done
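
On the multi-GPU point above: one possible shape is to separate the temperature-to-speed decision from the hardware calls, then loop over whatever `nvidia-smi --query-gpu` reports. A sketch, not the final script - the `--query-gpu` syntax assumes a reasonably recent driver, the nvidia-settings call still needs the X server running, and the hardware part is guarded so the mapping itself can be tried anywhere:

```shell
#!/bin/bash
# Sketch of a per-GPU version of the watchdog loop above

min_threshold=45   # degrees C
max_threshold=60
min_speed=35       # percent of maximum
mid_speed=60
max_speed=100

# Pure-bash mapping from temperature to target fan speed.
# Note -gt: '>' would compare the values as strings, not numbers.
speed_for_temp() {
    if   [ "$1" -gt "$max_threshold" ]; then echo "$max_speed"
    elif [ "$1" -gt "$min_threshold" ]; then echo "$mid_speed"
    else                                     echo "$min_speed"
    fi
}

# One "index, temperature" line per installed GPU, however many there are
if command -v nvidia-smi >/dev/null; then
    nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader |
    while IFS=', ' read -r idx temp; do
        nvidia-settings -a "[gpu:$idx]/GPUFanControlState=1" \
                        -a "[fan:$idx]/GPUCurrentFanSpeed=$(speed_for_temp "$temp")" \
                        > /dev/null
    done
fi
```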

*Problem regarding the missing value under "Fan speed" when using the command "nvidia-smi -a" in e.g. a script to control the fan on the GPU's*
OK, solved the "fan speed" issue as well







Didn't know this, so I had to "study" a little while







We need the value under "Fan speed" in the script. If the value is "0" the script will not work, of course. It's pretty simple and straightforward. Here's how we do it

How to enable "Fan speed" value in nvidia-smi:

Edit */etc/X11/xorg.conf* in "Section Device" and add this line *Option "Coolbits" "4"*

Command:

Code:


vi /etc/X11/xorg.conf

Add the line. It should look like this:

Code:


Command: # vi /etc/X11/xorg.conf

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 319.37  ([email protected])  Wed Jul  3 18:14:07 PDT 2013

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "Module"
    Load           "dbe"
    Load           "extmod"
    Load           "type1"
    Load           "freetype"
    Load           "glx"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "keyboard"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GT 640"
    Option         "Coolbits" "4"       <--------------------- ADD THIS LINE FOR EVERY "DeviceX" 
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
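Side note: on multi-GPU rigs there will be one "Device" section per card, and each one needs the Coolbits line (nvidia-xconfig with its --cool-bits option can also write it for you). A small sketch to sanity-check the result; the function name and test path are mine, not from the thread:

```shell
#!/bin/sh
# Sketch: verify that every Device section in an xorg.conf got the
# Coolbits option. Prints the counts and fails if they differ.
check_coolbits() {
    devices=$(grep -c '^Section "Device"' "$1")
    coolbits=$(grep -c 'Coolbits' "$1")
    echo "$devices device section(s), $coolbits with Coolbits"
    [ "$devices" -eq "$coolbits" ]
}

# on the real server, run: check_coolbits /etc/X11/xorg.conf
```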

OK, a few "hiccups" here as well. I'll get back to them; let's finish the test. It's long overdue.


----------



## DanHansenDK

*STATUS:*
22.08.14 11:04am

Dismounting the standard fans is done. Let's mount the new fans.

*Mounting the new fans in the 2U chassis/case.*

*Image 1. Let's do this*
Try to fit as many of the wires as possible into the "front" of the case. I tried to keep as few wires and as little other stuff as possible in the "back", where the mobo is. Anything that's in the way of the airflow is a no-go. I'll get this into the complete ToDo as well.











*Image 2. Mounting the new fan, but!*
Red: It's now pretty straightforward. Just mount one fan after another using the screws from before. But please notice one thing.
*IMPORTANT!* When fitting the screws, DO NOT TIGHTEN THEM TOO MUCH!!! It's very important that you do not overtighten the screws. The frame is made of plastic, so if you tighten them too hard the material will break. What's even more important: the fan will deform to fit the holder/bracket it's placed in, which means that overtightening the screws may result in a fan that seizes or runs very hot. By that I mean so hot that the ball bearings and the plastic can melt. It's serious, but if this happens you will be warned by the shell script we are making, and by the AeroCool temp./fan control as well. So it's all good, but it's something to keep in mind anyway.
The right way to fit the fan is to tighten the screws with the smallest screwdriver possible (less torque that way), up to the point where you feel resistance; you can also see when the screw is all the way in. Tighten it to that point, then unscrew it a little, then tighten it again. That makes it much easier to feel the point beyond which it shouldn't be tightened. The first time you tighten the screw, it cuts its own thread* through the plastic, so the second time it's much easier to seat it without forcing it.
* _thread_ (that's what my dictionary told me, so I hope it's correct)



*Image 3 & 4. Mounting the fan, please notice the wires!*
When you mount the fan, please notice the wires on the other side! The fan has a little plastic "thingy" whose job is to hold the wires in place. Once all 4 screws have been tightened the wires should be held by it, but you may have to work a little to get them there. These two images show what I'm talking about: the first image shows what may happen when mounting the fan, so please keep this in mind when fitting the fan to the case. If you've already tightened the fan and this is what happened, unscrew all 4 screws a third of the way and then correct the problem.




*Image 4. Fitting the fan grill*
Mount the fan grill with the same caution as when you mounted the fan itself. It's straightforward.



*Image 5. Notice the wires*
Green: Please notice the wires when finishing the fan installation. Here the wires may "pop out" of the "thingy" which holds them in place.







It's important that no wire is loose where it might get in the way of the spinning fan. It's always very important to keep an eye on these things; this is what causes most accidents and breakdowns: wires and connectors not fitted the right way!!



*AeroCool and what it does*
Now we have mounted the new fans from PAPST too. Let's begin testing temperatures. The AeroCool controller detects the temp. on each of its sensors, and if it reaches a limit (set by you, of course) it increases the fan's RPM. Both the "lower" limit (the speed the fan runs at to begin with) and the "alarm limit" (the temp. where the fan starts increasing its RPM to cool the system off) are set by you. Setting these temps is not a must: you can choose to have no "lower limit", and you can choose to have a fan run at full speed at all times. It's also possible to set an "alarm" which goes off when a certain temp. is reached. This is one of the reasons I chose to use a temp./fan controller: the fans will run at the speed you set and nothing else, no matter what "mood" your mobo is in.









NOTE! Regarding the AeroCool temp./fan controller, it's not all good. I've noticed a problem with the LCD display: the light seems to get dimmer as time passes. I'm not sure about this, but it's something I'll check "down the road". It may just be a glitch on this "older" AeroCool, but we need to be sure. I don't want any circuit boards with sh**ty solderings, or as my old electronics teacher called them, "poo-solderings".









*STATUS:*
22.08.14 1:34pm <--- Testing the new fans at 100%
Let's test the new fans and compare the results to those from the standard fans. We'll do this first test running the fans at 100% to see how much difference they can make at most. Then we'll see if the fans at e.g. 80% would be enough. Let's see what the test reveals.









The standard fans turned at 1900-2100 RPM according to AeroCool.
Here are the temperature results from those fans:

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +78.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +78.0°C  (high = +80.0°C, crit = +100.0°C)  <------- here is one issue. Too close to the temperature where the CPU needs cooling
Core 1:         +73.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +69.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +68.0°C  (high = +80.0°C, crit = +100.0°C)

# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 58 C
        Gpu                         : N/A
        Gpu                         : 66 C  <------- here is one issue. Too close to the temperature where the GPU needs cooling*
        Gpu                         : N/A
        Gpu                         : 57 C

# Know that it endures 95 degrees Celsius, but it heats up the case/environment.
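As a side note, newer nvidia-smi builds can print just the temperatures, one line per GPU, which avoids the "N/A" rows that "nvidia-smi -a | grep Gpu" produces. A sketch, guarded so it degrades gracefully where the query interface or tool is missing (check `nvidia-smi --help-query-gpu` for your driver version):

```shell
#!/bin/sh
# Sketch: query only the GPU core temperatures, one value per line.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader
else
    echo "nvidia-smi not available"
fi
```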


----------



## DanHansenDK

*STATUS:*
22.08.14 1:58pm <---- Testing the new fans. Here are the results:

After 10 min. GPUs at 100% / Fans at 3300 RPM:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 46 C
        Gpu                         : N/A
        Gpu                         : 46 C
        Gpu                         : N/A
        Gpu                         : 51 C
        Gpu                         : N/A
        Gpu                         : 48 C

After 20 min. GPUs at 100% / Fans at 3300 RPM:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 46 C
        Gpu                         : N/A
        Gpu                         : 47 C
        Gpu                         : N/A
        Gpu                         : 51 C
        Gpu                         : N/A
        Gpu                         : 49 C

After 1 hour. GPUs at 100% / Fans at 3300 RPM:

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +79.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +79.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +74.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +70.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +69.0°C  (high = +80.0°C, crit = +100.0°C)

# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 46 C
        Gpu                         : N/A
        Gpu                         : 47 C
        Gpu                         : N/A
        Gpu                         : 51 C
        Gpu                         : N/A
        Gpu                         : 49 C

*SUCCESS !!!*
Well, I don't think "Success" is too big a word to use right now.

The temp. has gone down 15 degrees Celsius! So I was right, I guess: there was not enough airflow inside the case! This really saves my day.








The CPU temp. hasn't changed, but that's OK, because it's not being controlled by AeroCool. When connecting it I noticed that it worked, but there was no output (RPM) on the AeroCool display, so there's an issue to be solved there as well. This means that the 2U CPU cooler is being controlled by the mobo, and therefore only increases its RPM when needed; the mobo has several different set-ups for controlling chassis/CPU fans, as you might know. More about this later on. Right now let's just be thankful for the result. We may be able to build this Economic Semi-Professional Boinc Cruncher without water-cooling and save some cash.









*STATUS:*
22.08.14 3:25pm <---- Testing the new fans at a lower speed:

After 10 min. GPUs at 100% / Fans at 2500 RPM:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 48 C
        Gpu                         : N/A
        Gpu                         : 51 C
        Gpu                         : N/A
        Gpu                         : 55 C
        Gpu                         : N/A
        Gpu                         : 52 C

After 20 min. GPUs at 100% / Fans at 2500 RPM:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 49 C
        Gpu                         : N/A
        Gpu                         : 52 C
        Gpu                         : N/A
        Gpu                         : 55 C
        Gpu                         : N/A
        Gpu                         : 52 C

*STATUS:*
22.08.14 3:45pm <---- Testing the new fans at a slightly higher speed:

After 10 min. GPUs at 100% / Fans at 2800 RPM:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 47 C
        Gpu                         : N/A
        Gpu                         : 49 C
        Gpu                         : N/A
        Gpu                         : 53 C
        Gpu                         : N/A
        Gpu                         : 51 C

After 30 min. GPUs at 100% / Fans at 2800 RPM:

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 47 C
        Gpu                         : N/A
        Gpu                         : 49 C
        Gpu                         : N/A
        Gpu                         : 53 C
        Gpu                         : N/A
        Gpu                         : 51 C

*CONCLUSION:*
22.08.14 4:28pm
I think we can conclude that these fans are all we need for this system. Now I'll start a "burn-in" test, where we'll hit it with all we've got and let it run for 48 hours. I've just written a little shell script which takes the temperature every hour and logs it. Let's see how our "poor man's super cruncher" handles a little work.
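The hourly logging script isn't shown in the thread; here is a minimal sketch of what it might look like. The name, log path, and output format are my assumptions, not the author's actual script:

```shell
#!/bin/bash
# TempLog.sh - hypothetical sketch of an hourly temperature logger.
# Run from cron, e.g.:  0 * * * * /root/TempLog.sh
LOGFILE="${LOGFILE:-$HOME/temp-burnin.log}"

{
    date '+%d.%m.%y %H:%M'
    # CPU core temperatures via lm-sensors, if installed
    command -v sensors >/dev/null 2>&1 && sensors | grep '^Core'
    # GPU temperatures via the NVIDIA driver, if installed
    command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -a | grep Gpu
    echo "----"
} >> "$LOGFILE"
```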









I think this looks pretty good:

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +66.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +66.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +66.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +64.0°C  (high = +80.0°C, crit = +100.0°C)

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 47 C
        Gpu                         : N/A
        Gpu                         : 50 C
        Gpu                         : N/A
        Gpu                         : 53 C
        Gpu                         : N/A
        Gpu                         : 51 C

_I will be back....._

.


----------



## Tex1954

Nice progress! I like your attention to detail as well.

And hopefully, what works for one setup is repeatable in all setups.

Nice work so far! Looks really good!










PS: I moved from Texas to Tennessee and then to Paducah, Kentucky as my final retirement venue near good fishing and cheap housing...


----------



## DanHansenDK

Hi Tex,

Thank you







Nice to know who you are talking to, and in which direction "they" are, anyway. I think you know what I mean. Sounds really nice where you are.

We have to stay put in this, our home of more than 10 years; actually, I was raised in this area, Koege, close to Copenhagen. But we are moving to our dream house when I've finished university. It's a little late in life to become a student again, but I needed the know-how. What I'm learning in these 3 years, I couldn't "pick up" myself; it's just not possible. I needed to learn about SMD components and how to replace, test and check them, so I chose to take this electronics engineer degree. Anyway, thanks for letting me know.









OK. Regarding the test, there's a little problem. The system suddenly rebooted, and I've just checked the system uptime:

Code:


# uptime
 22:11:04 up 16 min,  1 user,  load average: 4.78, 4.73, 4.53

This was the reason I wanted to get the temperature of the GPUs down to begin with; that is, if it's the same thing that's causing the reboots. Well, I'll let the system run a little while longer to see if it happens again. Back then, testing system 3 with the Asus Extreme mobo, the crashes and reboots were many more. I'm not sure this is the same thing, but I think it might be. Let's see what happens. We still have 500 more RPM to play with.

I just didn't think these temperatures could cause it to "crash"; it's much hotter near the heatsinks of the GPUs. But we did manage to lower the temp. 11-15 degrees Celsius.

Code:


# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 49 C
        Gpu                         : N/A
        Gpu                         : 50 C
        Gpu                         : N/A
        Gpu                         : 55 C
        Gpu                         : N/A
        Gpu                         : 51 C

*GOT AN IDEA???*
Does anybody have an idea what caused the crash & reboot???
The system rebooted at about 21:24 (9:24pm).
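Grepping syslog by eye works, but reboots can also be counted mechanically: each kernel line containing "Command line: BOOT_IMAGE" marks the start of a fresh boot (the path and log format below assume Ubuntu 12.04; `last -x reboot` shows the same information from wtmp). A sketch:

```shell
#!/bin/sh
# Sketch: count boots recorded in a syslog-style file. The count going
# up unexpectedly means the box restarted behind your back.
count_boots() {
    grep -c 'Command line: BOOT_IMAGE' "$1"
}

# on the server, run: count_boots /var/log/syslog
```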

Code:


# vi syslog
Aug 22 13:41:29 beaufort dhclient: DHCPACK of 192.168.1.29 from 192.168.1.1
Aug 22 13:41:29 beaufort dhclient: bound to 192.168.1.29 -- renewal in 42149 seconds.
Aug 22 13:41:29 beaufort kernel: [    5.824371] type=1400 audit(1408707689.750:8): apparmor="STATUS" operation="profile_replace" name="/sbin/dhclient" pid=1065 comm="apparmor_parser"
Aug 22 13:41:29 beaufort kernel: [    5.824536] type=1400 audit(1408707689.750:9): apparmor="STATUS" operation="profile_replace" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=1065 comm="apparmor_parser"
Aug 22 13:41:29 beaufort kernel: [    5.824628] type=1400 audit(1408707689.750:10): apparmor="STATUS" operation="profile_replace" name="/usr/lib/connman/scripts/dhclient-script" pid=1065 comm="apparmor_parser"
Aug 22 13:41:29 beaufort kernel: [    5.825656] type=1400 audit(1408707689.750:11): apparmor="STATUS" operation="profile_load" name="/usr/sbin/tcpdump" pid=1067 comm="apparmor_parser"
Aug 22 13:41:29 beaufort cron[1120]: (CRON) INFO (pidfile fd = 3)
Aug 22 13:41:29 beaufort acpid: starting up with proc fs
Aug 22 13:41:29 beaufort cron[1144]: (CRON) STARTUP (fork ok)
Aug 22 13:41:29 beaufort cron[1144]: (CRON) INFO (Running @reboot jobs)
Aug 22 13:41:29 beaufort acpid: 1 rule loaded
Aug 22 13:41:29 beaufort acpid: waiting for events: event logging is off
Aug 22 13:41:32 beaufort kernel: [    8.893501] NVRM: os_pci_init_handle: invalid context!
Aug 22 13:41:32 beaufort kernel: [    8.893504] NVRM: os_pci_init_handle: invalid context!
Aug 22 13:41:38 beaufort ntpdate[1007]: adjust time server 91.189.94.4 offset 0.307711 sec
Aug 22 13:42:09 beaufort dbus[725]: [system] Activating service name='org.freedesktop.ConsoleKit' (using servicehelper)
Aug 22 13:42:09 beaufort dbus[725]: [system] Activating service name='org.freedesktop.PolicyKit1' (using servicehelper)
Aug 22 13:42:09 beaufort polkitd[1439]: started daemon version 0.104 using authority implementation `local' version `0.104'
Aug 22 13:42:09 beaufort dbus[725]: [system] Successfully activated service 'org.freedesktop.PolicyKit1'
Aug 22 13:42:09 beaufort dbus[725]: [system] Successfully activated service 'org.freedesktop.ConsoleKit'
Aug 22 14:17:01 beaufort CRON[2013]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 22 15:17:01 beaufort CRON[2069]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 22 15:52:29 beaufort kernel: [ 7858.605739] setiathome_7.01[2189]: segfault at ffffffffffffffc8 ip 0000000000763244 sp 00007fff563ef1d8 error 7
Aug 22 16:17:01 beaufort CRON[2256]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 22 17:17:01 beaufort CRON[2338]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 22 18:17:01 beaufort CRON[2386]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 22 19:17:01 beaufort CRON[2428]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 22 20:17:01 beaufort CRON[2484]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Aug 22 21:17:01 beaufort CRON[2525]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)

# THIS IS WHERE THE SYSTEM CHASHED AND REBOOTED !!!

Aug 22 21:24:48 beaufort kernel: imklog 5.8.6, log source = /proc/kmsg started.
Aug 22 21:24:48 beaufort rsyslogd: [origin software="rsyslogd" swVersion="5.8.6" x-pid="792" x-info="http://www.rsyslog.com"] start
Aug 22 21:24:48 beaufort rsyslogd: rsyslogd's groupid changed to 103
Aug 22 21:24:48 beaufort rsyslogd: rsyslogd's userid changed to 101
Aug 22 21:24:48 beaufort rsyslogd-2039: Could not open output pipe '/dev/xconsole' [try http://www.rsyslog.com/e/2039 ]
Aug 22 21:24:48 beaufort kernel: [    0.000000] Initializing cgroup subsys cpuset
Aug 22 21:24:48 beaufort kernel: [    0.000000] Initializing cgroup subsys cpu
Aug 22 21:24:48 beaufort kernel: [    0.000000] Linux version 3.8.0-29-generic ([email protected]) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 (Ubuntu 3.8.0-29.42~precise1-generic 3.8.13.5)
Aug 22 21:24:48 beaufort kernel: [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.8.0-29-generic root=/dev/mapper/beaufort--vg-root ro
Aug 22 21:24:48 beaufort kernel: [    0.000000] KERNEL supported cpus:
Aug 22 21:24:48 beaufort kernel: [    0.000000]   Intel GenuineIntel
Aug 22 21:24:48 beaufort kernel: [    0.000000]   AMD AuthenticAMD
Aug 22 21:24:48 beaufort kernel: [    0.000000]   Centaur CentaurHauls
Aug 22 21:24:48 beaufort kernel: [    0.000000] e820: BIOS-provided physical RAM map:
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000088a05fff] usable
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x0000000088a06000-0x0000000088a0cfff] ACPI NVS
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x0000000088a0d000-0x0000000089545fff] usable
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x0000000089546000-0x000000008998ffff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x0000000089990000-0x000000009cb0dfff] usable
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x000000009cb0e000-0x000000009d088fff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x000000009d089000-0x000000009d0c8fff] usable
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x000000009d0c9000-0x000000009d171fff] ACPI NVS
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x000000009d172000-0x000000009dffefff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x000000009dfff000-0x000000009dffffff] usable
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000025effffff] usable
Aug 22 21:24:48 beaufort kernel: [    0.000000] NX (Execute Disable) protection: active
Aug 22 21:24:48 beaufort kernel: [    0.000000] SMBIOS 2.7 present.
Aug 22 21:24:48 beaufort kernel: [    0.000000] DMI: To Be Filled By O.E.M. To Be Filled By O.E.M./Z87 OC Formula, BIOS P2.10 07/17/2014
Aug 22 21:24:48 beaufort kernel: [    0.000000] e820: update [mem 0x00000000-0x0000ffff] usable ==> reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
Aug 22 21:24:48 beaufort kernel: [    0.000000] No AGP bridge found
Aug 22 21:24:48 beaufort kernel: [    0.000000] e820: last_pfn = 0x25f000 max_arch_pfn = 0x400000000
Aug 22 21:24:48 beaufort kernel: [    0.000000] MTRR default type: uncachable
Aug 22 21:24:48 beaufort kernel: [    0.000000] MTRR fixed ranges enabled:
Aug 22 21:24:48 beaufort kernel: [    0.000000]   00000-9FFFF write-back
Aug 22 21:24:48 beaufort kernel: [    0.000000]   A0000-BFFFF uncachable
Aug 22 21:24:48 beaufort kernel: [    0.000000]   C0000-CFFFF write-protect
Aug 22 21:24:48 beaufort kernel: [    0.000000]   D0000-E7FFF uncachable
Aug 22 21:24:48 beaufort kernel: [    0.000000]   E8000-FFFFF write-protect
Aug 22 21:24:48 beaufort kernel: [    0.000000] MTRR variable ranges enabled:
Aug 22 21:24:48 beaufort kernel: [    0.000000]   0 base 0000000000 mask 7E00000000 write-back
Aug 22 21:24:48 beaufort kernel: [    0.000000]   1 base 0200000000 mask 7FC0000000 write-back
Aug 22 21:24:48 beaufort kernel: [    0.000000]   2 base 0240000000 mask 7FF0000000 write-back
Aug 22 21:24:48 beaufort kernel: [    0.000000]   3 base 0250000000 mask 7FF8000000 write-back
Aug 22 21:24:48 beaufort kernel: [    0.000000]   4 base 0258000000 mask 7FFC000000 write-back
Aug 22 21:24:48 beaufort kernel: [    0.000000]   5 base 025C000000 mask 7FFE000000 write-back
Aug 22 21:24:48 beaufort kernel: [    0.000000]   6 base 025E000000 mask 7FFF000000 write-back
Aug 22 21:24:48 beaufort kernel: [    0.000000]   7 base 00C0000000 mask 7FC0000000 uncachable
Aug 22 21:24:48 beaufort kernel: [    0.000000]   8 base 00A0000000 mask 7FE0000000 uncachable
Aug 22 21:24:48 beaufort kernel: [    0.000000]   9 disabled
Aug 22 21:24:48 beaufort kernel: [    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
Aug 22 21:24:48 beaufort kernel: [    0.000000] original variable MTRRs
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 0, base: 0GB, range: 8GB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 1, base: 8GB, range: 1GB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 2, base: 9GB, range: 256MB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 3, base: 9472MB, range: 128MB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 4, base: 9600MB, range: 64MB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 5, base: 9664MB, range: 32MB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 6, base: 9696MB, range: 16MB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 7, base: 3GB, range: 1GB, type UC
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 8, base: 2560MB, range: 512MB, type UC
Aug 22 21:24:48 beaufort kernel: [    0.000000] total RAM covered: 8176M
Aug 22 21:24:48 beaufort kernel: [    0.000000] Found optimal setting for mtrr clean up
Aug 22 21:24:48 beaufort kernel: [    0.000000]  gran_size: 64K         chunk_size: 32M         num_reg: 6      lose cover RAM: 0G
Aug 22 21:24:48 beaufort kernel: [    0.000000] New variable MTRRs
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 0, base: 0GB, range: 2GB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 1, base: 2GB, range: 512MB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 2, base: 4GB, range: 4GB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 3, base: 8GB, range: 1GB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 4, base: 9GB, range: 512MB, type WB
Aug 22 21:24:48 beaufort kernel: [    0.000000] reg 5, base: 9712MB, range: 16MB, type UC
Aug 22 21:24:48 beaufort kernel: [    0.000000] e820: update [mem 0xa0000000-0xffffffff] usable ==> reserved
Aug 22 21:24:48 beaufort kernel: [    0.000000] e820: last_pfn = 0x9e000 max_arch_pfn = 0x400000000
Aug 22 21:24:48 beaufort kernel: [    0.000000] found SMP MP-table at [mem 0x000fd9b0-0x000fd9bf] mapped at [ffff8800000fd9b0]
Aug 22 21:24:48 beaufort kernel: [    0.000000] initial memory mapped: [mem 0x00000000-0x1fffffff]
Aug 22 21:24:48 beaufort kernel: [    0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
Aug 22 21:24:48 beaufort kernel: [    0.000000] Using GB pages for direct mapping
Aug 22 21:24:48 beaufort kernel: [    0.000000] init_memory_mapping: [mem 0x00000000-0x9dffffff]
Aug 22 21:24:48 beaufort kernel: [    0.000000]  [mem 0x00000000-0x7fffffff] page 1G
Aug 22 21:24:48 beaufort kernel: [    0.000000]  [mem 0x80000000-0x9dffffff] page 2M
Aug 22 21:24:48 beaufort kernel: [    0.000000] kernel direct mapping tables up to 0x9dffffff @ [mem 0x1fffe000-0x1fffffff]
Aug 22 21:24:48 beaufort kernel: [    0.000000] init_memory_mapping: [mem 0x100000000-0x25effffff]
Aug 22 21:24:48 beaufort kernel: [    0.000000]  [mem 0x100000000-0x23fffffff] page 1G
Aug 22 21:24:48 beaufort kernel: [    0.000000]  [mem 0x240000000-0x25effffff] page 2M
Aug 22 21:24:48 beaufort kernel: [    0.000000] kernel direct mapping tables up to 0x25effffff @ [mem 0x9d0c7000-0x9d0c8fff]
Aug 22 21:24:48 beaufort kernel: [    0.000000] RAMDISK: [mem 0x3604a000-0x3701cfff]
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: RSDP 00000000000f0490 00024 (v02 ALASKA)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: XSDT 000000009d14d080 00084 (v01 ALASKA    A M I 01072009 AMI  00010013)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: FACP 000000009d157f88 0010C (v05 ALASKA    A M I 01072009 AMI  00010013)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: DSDT 000000009d14d1a0 0ADE7 (v02 ALASKA    A M I 00000210 INTL 20091112)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: FACS 000000009d170080 00040
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: APIC 000000009d158098 00072 (v03 ALASKA    A M I 01072009 AMI  00010013)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: FPDT 000000009d158110 00044 (v01 ALASKA    A M I 01072009 AMI  00010013)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: ASF! 000000009d158158 000A5 (v32 INTEL       HCG 00000001 TFSM 000F4240)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: SSDT 000000009d158200 00539 (v01  PmRef  Cpu0Ist 00003000 INTL 20051117)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: SSDT 000000009d158740 00AD8 (v01  PmRef    CpuPm 00003000 INTL 20051117)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: SSDT 000000009d159218 001C7 (v01  PmRef LakeTiny 00003000 INTL 20051117)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: MCFG 000000009d1593e0 0003C (v01 ALASKA    A M I 01072009 MSFT 00000097)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: HPET 000000009d159420 00038 (v01 ALASKA    A M I 01072009 AMI. 00000005)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: SSDT 000000009d159458 0036D (v01 SataRe SataTabl 00001000 INTL 20091112)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: SSDT 000000009d1597c8 03493 (v01 SaSsdt  SaSsdt  00003000 INTL 20091112)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: AAFT 000000009d15cc60 00475 (v01 ALASKA OEMAAFT  01072009 MSFT 00000097)
Aug 22 21:24:48 beaufort kernel: [    0.000000] ACPI: Local APIC address 0xfee00000
Aug 22 21:24:48 beaufort kernel: [    0.000000] No NUMA configuration found


----------



## DanHansenDK

*STATUS 48 HOUR BURN-IN TEST:*

*24 HOUR CHECK!*
OK, it's now precisely 24 hours since the test system rebooted for some reason. Yesterday I increased the fan speed by 200 RPM, so they were running at 3000 RPM (according to the AeroCool temp./fan controller). According to the specs, the PAPST fans should be able to run at more than 5000 RPM, so I'm a little lost on this matter. Anyway, the test shows that for the last 24 hours the system hasn't rebooted/crashed. Maybe it was a whole other thing that caused the crash/reboot, I don't know; the system is not quite perfected yet, we know that. We'll see after another 24 hours of testing. Here are the results:

Code:


# uptime
 21:20:20 up 23:55,  1 user,  load average: 4.61, 4.71, 4.79  <---- UPTIME 23 HOURS AND 55 MINUTES  ;)

# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 46 C
        Gpu                         : N/A
        Gpu                         : 47 C
        Gpu                         : N/A
        Gpu                         : 50 C
        Gpu                         : N/A
        Gpu                         : 48 C

*48 HOUR CHECK!*
OK, it looks like we did it. Here's the result of the 48 hour burn-in test:

Code:


# uptime
 21:49:59 up 2 days, 25 min,  1 user,  load average: 4.76, 4.76, 4.72

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +62.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +61.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +58.0°C  (high = +80.0°C, crit = +100.0°C)

# nvidia-smi -a |grep Gpu
        Gpu                         : N/A
        Gpu                         : 46 C
        Gpu                         : N/A
        Gpu                         : 47 C
        Gpu                         : N/A
        Gpu                         : 50 C
        Gpu                         : N/A
        Gpu                         : 48 C
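Checks like the two above are easy to fold into a small script. Here's a minimal sketch of what a GPUFanWatchDog-style temperature check could look like (hypothetical: the 85 C limit and the alert action are my assumptions, not the actual WatchDog scripts):

```shell
#!/bin/sh
# Sketch of a GPUFanWatchDog-style check. Threshold and alert action are
# assumed, not taken from the real (unpublished) scripts.

GPU_TEMP_LIMIT=85   # degrees C

# Read "Gpu : NN C" style lines on stdin and print the highest temperature.
# Lines reading "N/A" simply don't match the pattern and are ignored.
max_gpu_temp() {
    grep -o '[0-9][0-9]* C' | awk '{ if ($1 + 0 > max) max = $1 + 0 } END { print max + 0 }'
}

# Only query the hardware when the NVIDIA tools are actually present.
if command -v nvidia-smi >/dev/null 2>&1; then
    temp=$(nvidia-smi -a | grep Gpu | max_gpu_temp)
    if [ "$temp" -ge "$GPU_TEMP_LIMIT" ]; then
        echo "WARNING: hottest GPU at ${temp} C (limit ${GPU_TEMP_LIMIT} C)"
        # here one could mail a warning, throttle BOINC, or shut down
    fi
fi
```

Run from cron every minute or so and the log/mail step can be whatever the WatchDog scripts end up doing.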

OK, there are a couple of issues still to be solved, but the system works now, and very well actually! I just received the "Top 5% average" from SETI, so we are doing something right.

According to my calculations (not that accurate), running 3 systems like this we will get about 15-20,000 points per system. That's between 45-60,000 points a day. Let's say 50,000 on average; that makes about 1,500,000 points every month. That way we'll get into "better society" pretty d... fast!









*Back to reality! Issues need to be solved.*
1. I'll test the new update 12.04.5 with CUDA 5.5. If it doesn't work, I've got a possible solution to the issues regarding 12.04.3/CUDA 5.5 from a friend at HowToForge, "Srijan". Let's see how that goes.
2. Solve the issues regarding the "headless" part.
3. Finish the shell scripts, which are going to watch over the system and alert you if anything goes wrong.
4. Solve the hardware issue regarding the CPU fan connector, connecting the ASRock Z87 OC Formula to the AeroCool Temp. & Fan Control. The RPM doesn't show in the display. The display has been tested, of course!
5. Decide if fan control by LM-sensors shall control the system fans using the GPUFanWatchDog.sh and SystemFanWatchDog.sh scripts. Using software fan control can increase the fan temperature! So, if we choose to do so, we need a warning system in case of a fan failure. Well, it's a warning system we need it for, so this shouldn't be a problem, right?








6. Check power consumption. Is this 550 watt 2U PSU really needed? I think not. And it's too d... expensive as well!

I think we'll use about 250-300 watts in total. If that's right, then it's pretty good I guess: 1 powerful CPU and 4 GPUs running at full speed. This is why I want the system to be without anything not needed.

_...more to come_

Tex, Magic!?!? What do you say?? Do you think it's because of the heat inside/on the graphics cards, or is it something else entirely???

*A PROBLEM:*
*Headless-Linux-CLI-Multiple-GPU-Boinc-Server problem: GPUs suddenly missing*
Installed Ubuntu Server 12.04 and CUDA 5.5 for number crunching/BOINC. Used the 12.04.3 update, since the 12.04.4 update doesn't work with CUDA!
The system ran perfectly, using all 4 GPUs to crunch data. Then suddenly, without installing or updating anything, the GPUs were lost to BOINC!
It might be better when testing this with the new 12.04.5 update and CUDA 5.5, if it works at all!! We are done testing the fans/GPU temperatures, so it's time to solve the issues.
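One common cause of GPUs "disappearing" from BOINC on a headless box is the client starting before the NVIDIA driver has created its device nodes, so restarting the client once the driver is up often brings them back. A hedged sketch, where the service name, device path and retry count are assumptions for a stock Ubuntu 12.04/14.04 install:

```shell
#!/bin/sh
# Sketch: wait for the NVIDIA device node before (re)starting the BOINC
# client, so the client detects the GPUs on boot. Service name, device
# path and retry count are assumptions.

# wait_for_device PATH TRIES: poll once per second until PATH exists.
wait_for_device() {
    dev=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        [ -e "$dev" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage sketch, e.g. from /etc/rc.local:
#   wait_for_device /dev/nvidia0 30 && service boinc-client restart
```

This matches the earlier observation in the thread that simply restarting the client after boot makes the GPUs reappear.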


----------



## DanHansenDK

Hello friends









I've been fighting this since early this morning, but I think we might just have cracked the case!

We are now crunching data on Ubuntu Server 14.04.1 LTS (newest version) using CUDA 6.5 (newest version), on all 4 GPUs and the CPU of course!
I've started a Burn-In test again, using these new versions.

Status after 10 min. of testing at 100%:

Code:

# nvidia-smi -a | grep GPU
Attached GPUs                       : 4
GPU 0000:01:00.0
    GPU UUID                        : GPU-ea22ef3d-4254-dff0-2db8-86656441c198
    MultiGPU Board                  : N/A
    GPU Operation Mode
        GPU Link Info
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
GPU 0000:02:00.0
    GPU UUID                        : GPU-bf213a08-c3c6-346b-53ff-5ff7d82c5c74
    MultiGPU Board                  : N/A
    GPU Operation Mode
        GPU Link Info
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
GPU 0000:03:00.0
    GPU UUID                        : GPU-d5813be2-bf30-6c90-a591-90fef765984f
    MultiGPU Board                  : N/A
    GPU Operation Mode
        GPU Link Info
        GPU Current Temp            : 53 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
GPU 0000:04:00.0
    GPU UUID                        : GPU-a9bb2423-c2ba-16cd-529f-cdcc43fafd61
    MultiGPU Board                  : N/A
    GPU Operation Mode
        GPU Link Info
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A


----------



## DanHansenDK

*STATUS:*
26.08.14 07:40pm
While testing I noticed some funny numbers in BoincTasks. The percentage of a job showed about 50% done, 12 minutes into the job. That's a little too "positive"!







OK, I have to say that the speed has gone up, using these new versions. I'm sure about that, but look at the image and the boxed area. Tell me what you think.
NB! The image/gpu output says "cuda55", but I promise you, this is CUDA6.5 from here: https://developer.nvidia.com/cuda-downloads#linux



*SOS! Please help me!*
There's another issue which I would very much like to get solved right now! The problem is that I can't use the AeroCool fan control for the CPU! The AeroCool works! It's the mobo, the ASRock Z87 OC Formula, which is somehow different from other mobos! When connecting the CPU fan to the AeroCool, there's an extra wire/connection for the motherboard, because the system detects whether a fan is connected. From the AeroCool, a wire runs from the control unit to the CPU fan and then on to the motherboard fan connector. The fan runs, but I can't see any RPM and therefore can't change the settings.
It really gets to me, because the fan speed is increasing and decreasing all the time. I'm going crazy listening to that noise!








Anybody who has a suggestion on how to solve this issue, please let me hear it. I know that I have to check/measure the power on a working mobo/CPU fan/AeroCool setup and then compare it with the ASRock situation, but that's a night's work, I know it! So if you have an idea, please tell me. It doesn't matter how "incredible" it sounds... it may lead to something. And all you overclockers out there, I know you understand CPU fans & fan controls!


----------



## BritishBob

Hook it up directly to the PSU with inline resistors to change the speed to what you want. Hook a small quiet fan up to the CPU fan header to avoid boot errors.


----------



## Tex1954

Umm, well, maybe I have an answer of sorts...

Some BOINC project WU's fail. Sometimes they create a system error. Some of them can NOT be stopped and started again because they improperly respond to terminate requests sent from the manager or whatever. You can even stop boinc all together and some WU's will keep on running in memory! Bad bad! Then you restart and suddenly have twice as many tasks running!
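A quick way to spot such runaway work units after stopping the client is to list what is still running under the BOINC account. A sketch, assuming the default `boinc` user of the Ubuntu package:

```shell
#!/bin/sh
# Sketch: after stopping the BOINC client, list processes still running
# under the BOINC account; anything left is a work unit that ignored the
# terminate request. The "boinc" user name is an assumption.

# leftover_tasks USER: print non-client processes owned by USER.
leftover_tasks() {
    ps -u "$1" -o pid=,comm= 2>/dev/null | grep -v 'boinc'
}

# Usage sketch:
#   service boinc-client stop
#   sleep 10   # grace period for tasks to exit
#   [ -n "$(leftover_tasks boinc)" ] && echo "WARNING: tasks survived the stop"
```

Killing the leftovers before restarting the client avoids the doubled-up tasks described above.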

Some systems have an automatic update feature that needs to be disabled.

Umm, and the other thing...If you have a CUDA driver/SDK that supports CUDA 6.5, that is fine. It will support all CUDA versions 6.5 and lower. However, some projects use CUDA 23, 45, 55 drivers in their task code and that is what you are seeing. Just because your system supports CUDA 9.8 doesn't mean a WU will use that version.

Anyways, not sure what the aerocool problem is... maybe the BIOS ignores the tach because it doesn't respond to PWM commands.

Good luck!


----------



## DanHansenDK

Hello friends,

Quote:


> Hook it up directly to the PSU with inline resistors to change the speed to what you want. Hook a small quiet fan up to the CPU fan header to avoid boot errors.


Hello BritishBob
Thanks.. That was another way of doing things! I certainly hadn't thought about that! Smart, pretty d... smart I have to say. It's just that I need to be able to control the RPM and/or I need the feedback from the fans. Anyway, it's a solution! And that's what I asked for.









Quote:


> Umm, and the other thing...If you have a CUDA driver/SDK that supports CUDA 6.5, that is fine. It will support all CUDA versions 6.5 and lower. However, some projects use CUDA 23, 45, 55 drivers in their task code and that is what you are seeing. Just because your system supports CUDA 9.8 doesn't mean a WU will use that version.


Hi Tex,
Got it








Quote:


> Some BOINC project WU's fail. Sometimes they create a system error. Some of them can NOT be stopped and started again because they improperly respond to terminate requests sent from the manager or whatever. You can even stop boinc all together and some WU's will keep on running in memory! Bad bad! Then you restart and suddenly have twice as many tasks running!


OK... I see. Thanks again









*STATUS/TESTING:*
Ubuntu Server 14.04.1 & CUDA 6.5 / Running headless at this time. We are getting there










*Crunching Nvidia GPUs / Linux 64-bit platform / [email protected]:*
Does anyone of you guys know about progress regarding GPU crunching on Linux 64-bit platforms for [email protected]??? Any development there? I haven't been able to find any "newish" apps. Does anybody have a working config file for it as well, to compare with the one I made? I'm unsure about this.


----------



## BritishBob

Quote:


> Hello BritishBob
> Thanks.. That was another way of doing things! I certainly hadn't thought about that! Smart, pretty d... smart I have to say. It's just that I need to be able to control the RPM and/or I need the feedback from the fans. Anyway, it's a solution! And that's what I asked for.


Do you need software control, or just some way to turn it down? You could put a variable resistor in there with values around what you want and just run some extra cable to the front. Drill a hole and have a variable resistor knob sticking out the front/back of the case.


----------



## DanHansenDK

Hi BB,

I mixed stuff together, sorry about that! It was a way to get that fan to shut up while finding what's causing the problem with the AeroCool.







Actually I've got some variable resistors ("potentiometer" in Danish). I just have to check the resistance/value. I think it's about 200Ω, and according to my calculations the right value would be about 150-180Ω. I'll try this and use this solution until we've solved the issue regarding the missing feedback to the AeroCool. It has to be a mobo thing, and this mobo only, because the LCD display works (I've tested it) and all the other mobos (not the same model) work just fine.

Thanks for the help


----------



## DanHansenDK

Hello friends,

It has been a little while, but I'm back again. I've been fighting the "Ubuntu Server CLI vs. CUDA" issue!









This test system is test system number 5, and we are going to use a new mobo this time. We are going for the Z97 chipset. And we are going to see the shell scripts tested. I've been making/finding a program/package so that we can send system notifications and alerts from the WatchDog* shell scripts. This program is "msmtp", which is very easy to set up and works just perfectly!!
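For reference, msmtp only needs a small per-user config file and reads the message on stdin. A sketch with placeholder account details (the host, addresses and password below are not real):

```shell
#!/bin/sh
# Sketch of alert mail via msmtp from a WatchDog script. The account in
# the sample ~/.msmtprc below is a placeholder (keep the file chmod 600):
#
#   defaults
#   auth on
#   tls on
#   account alerts
#   host smtp.example.com
#   port 587
#   from cruncher@example.com
#   user cruncher@example.com
#   password secret
#   account default : alerts

# build_alert SUBJECT BODY: print a minimal mail message on stdout.
build_alert() {
    printf 'Subject: %s\n\n%s\n' "$1" "$2"
}

# Usage sketch from a WatchDog script:
#   build_alert "GPU overheating" "GPU0 at 92 C" | msmtp admin@example.com
```

Since msmtp is only an SMTP forwarder, this works on a headless box with no local mail system at all.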

Issues still to be solved:
1. FanController AeroCool X-Vision doesn't show RPM in the display / can't be controlled.

Total power consumption for this system running at 100% (1x CPU at 100% & 4x GPU at 100%):
Watts: 217









Here's the hardware I just ordered for test System 5 "Blenheim":

1x ASRock OC Formula Z97
Intel i5-4690K
Industrial 2U Cooler from JAC
Industrial 2U PSU ATX300W
4x Industrial Fans, Papst
2x 4Gb Kingston HyperX Genesis X2 Grey S.
Asus GT640-1GD5-L PCIe 2.0 x16
Asus GT640-1GD5-L PCIe 2.0 x16
Asus GT640-1GD5-L PCIe 2.0 x16
Asus GT640-1GD5-L PCIe 2.0 x16

Mobo: http://www.asrock.com/mb/Intel/Z97%20OC%20Formula/


----------



## Finrond

Think you might transition to single slot 750's anytime soon? They might be a better buy in the long run.


----------



## DanHansenDK

Hi Finrond









Quote:


> Think you might transition to single slot 750's anytime soon?


You'll have to "spell it out" for me; I'm not that hip!

750 model / graphics card? mobo? Really interested in hearing from you again!


----------



## Doc_Gonzo

I would check the RAM voltage on that kit. I think it's 1.65V. If so, I would change it for a 1.5V kit. I ruined the memory controller on a 2500K by running 4 x 1.65V sticks of Kingston HyperX RAM


----------



## DanHansenDK

Hi Guys,

I'm kind of "crashing" my own thread here, but I have an issue I'm pretty sure you can help me solve!

For the next "generation" of "Headless Linux Multiple GPU Boinc Servers" I've just bought a graphics card to test. Since there's no news regarding Low Profile cards, and since I haven't found any Low Profile cards which are larger/better than the Asus GT640, I decided that the next "round" of tests/building affordable crunchers will be with normal size cards. And therefore I bought this Asus GTX 770 to test on a Windows rig, just to see how it handles heat, how it does compared to the other cards, and how much power it eats. And whether it's possible at all to get a PSU that can supply 4 of those cards.
But there's an issue. I'm really not very familiar with these "larger" graphics cards, and especially not the newer ones. So I'm having trouble finding information on how to install them. Not on the mobo of course, but regarding the power connectors on the graphics card. There are 2 connectors, 1x 6-pin and 1x 8-pin, and according to what I've read and learned, I must plug PCIe power into both of them. Both the 6-pin and the 8-pin!?!? I'm using a Corsair 750 watt PSU which has separate wires/connectors, and I have 2x wires with an 8-pin connector in one end and 2x 6+2-pin in the other end. Those 2x 6+2-pin connectors, I can see, are made so that you can connect them to both a 6-pin and an 8-pin connector.
I would just very much like to get some kind of confirmation that I'm supposed to connect power to both connectors on the card!? It says so here, but I would just hate to make a big mistake and ruin a card which caused an empty fridge this month!









Here's the connectors in question:



Hope to hear from you soon









***************************************************
No need for assistance - got it running now!
***************************************************


----------



## Finrond

Yep, gotta plug in both connectors!

And don't worry, it won't run at full speed or won't boot if you don't plug in the PCI-e power connectors, so you won't ruin the card.

As for my 750 comment, I was wondering if you were going to start replacing those 640's with something like a GTX 750. They have low-profile variants but they are dual slot.


----------



## DanHansenDK

Hi Finrond,

Quote:


> As for my 750 comment, I was wondering if you were going to start replacing those 640's with something like a GTX 750. They have low-profile variants but they are dual slot.


SAY WHAT!?!? It can't be true!?!? I've been looking and searching for a pretty g.. d... long time now!! I haven't managed to find anything which even compares to the GT640 from Asus!!! It endures heat very well (95 degrees C) and is very, very stable. Well, I have some trouble, the system is not perfect yet, but this is due to OS issues, or at least I'm pretty sure it is. Please show me these cards!! If it's possible to get a low profile edition, we can make a very special cruncher come together.

The problem will be the 2U PSU, but that's just a thing to fix. No big deal.

My head is already spinning. Thoughts flying away here!!! Regarding the dual slot issue, does anyone know the spacing between the sockets?? If they fit into a 4x SLI board we can only use 2 of course, but they will still be much faster than the 4x GT640 cards together..

That was great news!! I hope Asus has made a version of these LP cards!









Looking forward to hear from you...

I've noticed that a SETI job done by the GTX770 only takes about 3-4 minutes, some even less. The same job (same size) takes 1 hour and 30+ minutes for one of the CPU cores to do. I'm not sure about the GT640 yet, but I guess they are faster than one CPU core (i5-4670K / i5-4690K).

Here's a few samples:

I've currently joined [email protected] & [email protected], and these are the samples shown. It's the GTX770 we are looking at right now, not the GT640 from test systems 1-4.

The two Asteroids jobs are about 1,380,000 GFLOPs each. One is done by the GPU/Cuda55, the other by one of the CPU cores of the i5-4690K. As you can see, both jobs are about 1,380,000 GFLOPs. The GPU/Cuda55 does this job in 22 minutes; the CPU core takes 1 hour and about 25 minutes to do the "same" job. With that in mind, let's look at the next two jobs, this time SETI.

These two jobs are smaller, but the difference between the times it takes the CPU cores to do them is not that big. When it comes to the GPU, though, there's a huge difference! This GPU does the "same" job as the CPU core in 7 minutes!!! I know that these are estimated job sizes, that some use Cuda55, some Cuda32/Cuda42, and that there are a lot of other reasons why this is difficult to compare, but anyway. My G..!! 7 minutes!!! And I've even seen jobs being "chewed" in 3 minutes!
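To put numbers on comparisons like these, a tiny helper that turns two run times into a speedup ratio is handy (a sketch; the figures in the usage line are the Asteroids example above):

```shell
#!/bin/sh
# Sketch: express the run-time comparisons above as a speedup ratio.

# speedup CPU_MINUTES GPU_MINUTES: print how many times faster the GPU is.
speedup() {
    awk -v c="$1" -v g="$2" 'BEGIN { printf "%.1f\n", c / g }'
}

# Usage sketch (the Asteroids job above, ~85 CPU minutes vs. 22 GPU minutes):
#   speedup 85 22   # prints 3.9
```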











OK, I'm looking forward to hearing some noise, because I'm no professor on these matters, that's for sure. If anyone knows about these things, don't hesitate to "teach" me a lesson!









BTW, I should be able to put system 5, "Blenheim", together on Thursday or Friday!! D...! I just remembered one thing I forgot! I forgot the AeroCool Fan Control/Sensor!?!?
Do any of you guys use these fan controllers? I need to say that I've found a possible "weakness" in the AeroCool X-Vision fan controller. Sadly the display gets dimmer over time. When used 24/7, it seems that there's an issue with the display. I guess it's a component which can't take the heat (heat from its own components).
So, if you know a fan controller with temp. sensors, capable of controlling 5 fans, please let me know. One you have used for a longer period of time!











----------



## DanHansenDK

Hello









And then something like this happens! Suddenly it takes the GTX770 almost 4 hours to do a single Asteroids job. Same size as before. And the CPU core does this in about 1 hour 50 minutes. So in this case the GPU is twice as slow as a CPU core. I just don't get it. There must be an error in the readings, or those job sizes are very "estimated". I can't see what else could be wrong. It looks like it's not a good idea to crunch Asteroids jobs using GPUs in a Windows environment; in Linux I see no such issues!

*What say you guys?*

GPU GTX770:


CPU core (i5-4690K):




----------



## Finrond

There is only 1 low profile 750 that I can find at the moment

http://www.newegg.com/Product/Product.aspx?Item=9SIA5751ZT1529

Seems a lil pricey to me, but hey, low profile!


----------



## tictoc

Here's a few more:

*GTX 750*
Galaxy
Gigabyte

*GTX 750ti*
KFA2
Gigabyte


----------



## DanHansenDK

Hi Finrond & Tictoc,

Thanks a lot guys!! Looks promising!
But how come the low profile GTX 750 card from Zotac only has 500+ CUDA cores? Is this because it's a Low Profile card perhaps???

I'm looking for a card which endures heat the same way the Asus GT640 Low Profile does!







I must look a little more into the links you sent me. Thanks for that, my friends!

BTW! I got the hardware for the new test system 14 days ago, but haven't had the time to put it together!







University stuff! Trying hard to be admitted into this final bit of education, which I really need!! I'm so G.. d.... nervous!! Well, what I wanted to say was that I'm putting this 5th test system together right now! And this will be with the new 4x SLI mobo from ASRock with the Z97 chipset (Devil's Canyon). And we are going to deploy the "WatchDog*-scripts" on this system as well. No hardware fan & temperature controller on this system! We learned that there's too much difference between the onboard sensor data/temperatures and the output from these hardware fan/temp. controllers (in my case the AeroCool X-Vision). I would rather have had a hardware controller just to have the analogue output: a visible controller with alerts/sounds, blinking displays etc.
Anyway, there are some advantages to using the mobo to control the fans. This mobo has 4 extra fan controls/connectors which work with the onboard temperature sensors. This means we are able to control the fans with my "WatchDog*-scripts" and have the temperature of the system/onboard temperature "watched". This way it all comes together: CPU temperature control, warning, logging and safety shutdowns; GPU temperature control, warning, logging and safety shutdowns; chassis temperature control, warning, logging and safety shutdowns; and HD temperature control, warning, logging and safety shutdowns (not in my case, because I'm using SSDs). I'll make the script accessible, of course, for those of you who use standard drives!
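On Linux, those onboard fan headers are exposed through the hwmon sysfs interface (run `pwmconfig` from the lm-sensors package to find the real paths on a given board). A hedged sketch of how a WatchDog script could map temperature to fan speed; the sysfs paths and the 40-75 C curve below are assumptions:

```shell
#!/bin/sh
# Sketch of software fan control via the motherboard's hwmon interface.
# Paths and curve are assumptions; pwmconfig reveals the real paths.

# temp_to_pwm TEMP_C: map a temperature to a PWM duty value (0-255).
# Minimum duty 80 below 40 C, full speed from 75 C, linear in between.
temp_to_pwm() {
    t=$1
    if [ "$t" -le 40 ]; then
        echo 80
    elif [ "$t" -ge 75 ]; then
        echo 255
    else
        echo $(( 80 + (t - 40) * (255 - 80) / (75 - 40) ))
    fi
}

# Usage sketch (paths vary per board):
#   echo 1 > /sys/class/hwmon/hwmon1/pwm2_enable   # take manual control
#   temp_to_pwm 55 > /sys/class/hwmon/hwmon1/pwm2
```

The stock `fancontrol` daemon from lm-sensors does essentially this; a hand-rolled version just makes it easy to add the warning/logging/shutdown steps in the same place.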









Let's go!!!


----------



## DanHansenDK

Here's the hardware specs for test system 5:

1x ASRock OC Formula Z97
Intel i5-4690K
Industrial 2U Cooler from JAC
Industrial 2U PSU ATX300W
4x Industrial Fans, Papst
1x 4Gb Kingston HyperX Genesis X2 Grey S.
Asus GT640-1GD5-L PCIe 2.0 x16
Asus GT640-1GD5-L PCIe 2.0 x16
Asus GT640-1GD5-L PCIe 2.0 x16
Asus GT640-1GD5-L PCIe 2.0 x16
GPU specs: http://www.asus.com/Graphics_Cards/GT6401GD5L/specifications/


YES!!! We are ready











*Total power consumption for this system running at 100% (1x CPU at 100% & 4x GPU at 100%):*
*Watts: 217*








Here's a tool you can use when trying to find the right size PSU. It's not completely hopeless! It's actually OK. It works "on the safe side", that's for sure, but you can use it to be guided in the right direction. Click here

Stuff's getting better:
1. This board is smaller! It's a standard ATX size and not as large as the Z87 edition. I guess it's because of the "missing" internal LCD display (which to me was a total flop). It may also be due to the fact that this mobo hasn't got the "water cooling kit" on the onboard GPU. But we don't need it, so no big deal









Issues still to be solved:
1. Implementation of my "WatchDog* shell-scripts"
2. Ubuntu Server 14.04.1 issues (crashing & rebooting). Temperature or software issues?
3. Mobo Chassis Fan- & Temperature control.
4. 2U Industrial CPU Cooler bracket issue!

Links to the steps which are not going to be changed. The way we install the fans makes room for air ("breathing holes") etc.:
http://www.overclock.net/t/1467918/ubuntu-server-12-04-4-64bit-boinc-using-gpu-from-geforce-gt610-to-crunch-data/90#post_22734366
http://www.overclock.net/t/1467918/ubuntu-server-12-04-4-64bit-boinc-using-gpu-from-geforce-gt610-to-crunch-data/90#post_22741815





*STATUS 28.10.14 09:15*
Great news!! Using this new mobo has unintentionally solved one of the issues from earlier on! Remember the "bracket" for the "2U Industrial CPU Cooler"!?!? I had to cut, grind and remove a lot of the material to make room for the subsided components or component pins!! This is no longer an issue! It's PERFECT!! I hated to grind the bracket and make small metal burrs. It's just not the best stuff to have in this kind of environment!









Here's how it looked on one of the earlier mobo's:







As you can see below, highlighted in red boxes, the component pins are not even close to the bracket! Happy days!! I have high hopes for this mobo, I really do! I really, really hope that the 4 GPUs won't cause any problems! And that the "crash & reboot issue" from test system 4 is a software-based problem!?!?





Allrighty-then







Let's continue this...


----------



## DanHansenDK

******A question for you guys ******

*Linux FileServer with RAID1 - double backup to 2 extra disks:*
_I've been building a Linux FileServer with RAID1 (Ubuntu software RAID) and a double backup function to 2 extra disks (NTFS & EXT4). This way backups are readable from Windows boxes as well as Linux. All kinds of system monitoring and warning systems. Hot-swap bays (green box) for the RAID'ed disks, for a quick fix of a degraded array.








If 1 disk "dies", you'll be warned about it. Then just plug in a new disk and the system gets "updated". The system runs even though 1 disk is dead. In case of a double breakdown the data will not be accessible, of course, but the data will be on the 2 backup disks, in both NTFS and EXT4 file systems, so that the data is accessible right away.








A total step-by-step ToDo, from clean disks to system ready! Is this maybe something anybody in here could use? It's more of a ToDo and issue-solving guide than a graphical "how to do it"!?!? Just let me know if someone needs a thing like this.







_
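As a sketch of how the warning and backup parts could hang together (device names, mount points and the mail address are placeholders, not the actual setup):

```shell
#!/bin/sh
# Sketch of the file server's nightly job: warn if the RAID1 array is
# degraded, then mirror the data to the two backup disks. Device names,
# mount points and addresses are placeholders.

# array_degraded DEVICE: true if mdadm reports the array as degraded.
array_degraded() {
    mdadm --detail "$1" 2>/dev/null | grep -q 'degraded'
}

# Usage sketch (e.g. from a nightly cron job):
#   if array_degraded /dev/md0; then
#       echo "RAID1 /dev/md0 is degraded!" | msmtp admin@example.com
#   fi
#   rsync -a  --delete /srv/data/ /mnt/backup-ext4/
#   rsync -rt --delete /srv/data/ /mnt/backup-ntfs/   # NTFS keeps no owners/links
```

The NTFS copy uses `-rt` rather than `-a` because NTFS can't store Unix owners, permissions or symlinks, which is exactly why that disk stays readable from Windows.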


----------



## DanHansenDK

*STATUS 28.10.14 1:24pm*

*Issues still to be solved:*
1. Implementation of my "WatchDog* shell-scripts"
2. Ubuntu Server 14.04.1 issues (crashing & rebooting). Temperature or software issues?
3. Mobo Chassis Fan- & Temperature control --> x4 Papst Industrial Fans
4. 2U Industrial CPU Cooler bracket issue! [SOLVED]
5. System mail & warnings --> only SMTP! [SOLVED]
6. Front filter! A swap-able filter in front of the 2U chassis! Replace the bolt/screw; it's too large and stops the rack chassis from closing properly.
7. Rack-box Fan expansion!? Install another 2 large 220volts fans either in the top or in the bottom!!
8. Make a bracket for the 2U industrial PSU. (If you don't want to make one yourself, then write me and I'll send you one for free. Details later on!)

*STATUS 28.10.14 3:36pm*
Task 6. Front filter! These 2U server chassis are always "full of holes". That's the way they are designed! To have as much air as possible flow into the chassis, these 2U cases are full of ventilation holes. To keep a steady flow of air in the system, you place all fans in one direction. Usually you then install some kind of filter in the front of the case. This wonderful 2U case has a special bracket in the front which can be fitted with a filter. Sadly there's no room for the bracket, because I'm using a 60cm rack box and the case is 55cm. Not much space to work with!







I'll find a way to mount a new bolt, or I'll just use a screw to mount it. Then I'll just have to use a screwdriver when changing the filter, no big deal! We just have to be sure that the airflow stays as good as it is right now. It will be a problem if there's no access to enough air. Then it wouldn't work, no matter how fantastic the fans you install are!!











*STATUS 28.10.14 4:29pm*
Fitting the 4 Asus GT640 graphics cards with the low profile bracket! BTW! Asus doesn't deliver your "low profile" graphics card with the brackets needed for using the card as a low profile card!! (Red box). These you have to order separately. If only you were told about this, it wouldn't have been an issue. But when you have to open the box first to find out about this "missing" part, then it's a problem! I hate that kind of sales trick. Because that's what it is! Both the Zotac and MSI versions of the GT640/610 come with these brackets!



*Finish fitting stuff....*
Finish mounting the CPU, memory and 2U CPU cooler onto the mobo! And please notice we don't need to grind and cut into the 2U CPU cooler bracket, which is a great thing! I hate to use power tools when putting computers together!











*Fitting the Fans from Papst....*
Here we are going to fit the special industrial fans from Papst. For further details, look here please: Mounting the new fan's in the 2U chassis/case



*STATUS 28.10.14 5:02pm*
I'm looking forward so much to building this 5th system, because I think we've found the right mobo, the right setup and the right pieces! I'm really looking forward to implementing my "WatchDog*-scripts" and making the chassis fans run from the 4 onboard chassis fan connectors. It's perfect: 4 chassis fan connectors on this mobo, and 4 is what we need!
Let's go!


----------



## DanHansenDK

*STATUS 29.10.14 7:46am*

*Issues still to be solved:*
1. Implementation of my "WatchDog* shell-scripts"
2. Ubuntu Server 14.04.1 issues (crashing & rebooting). Temperature or software issues?
3a. Mobo Chassis Fan- & Temperature control --> x4 Papst Industrial Fans
3b. The industrial "Papst" fans have short wires! Not using the AeroCool X-Vision makes this an issue! Lengthening 2 fans' wires is needed! [SOLVED]
4. 2U Industrial CPU Cooler bracket issue! [SOLVED]
5. System mail & warnings --> only SMTP! [SOLVED]
6. Front filter! A swap-able filter in front of the 2U chassis! Replace the bolt/screw; it's too large and stops the rack chassis from closing properly.
7. Rack-box Fan expansion!? Install another 2 large 220volts fans either in the top or in the bottom!!
8. Make a bracket for the 2U industrial PSU. (If you don't want to make one yourself, then write me and I'll send you one for free. Details later on!)

OK! On with it









*The industrial "Papst" fans have short wires! Not using the AeroCool X-Vision makes this an issue! Lengthen 2 fans' wires! (3b.)*
For those of you who are not electronics engineers, here's how to do this step-by-step. _Please remember that all ToDos, guides etc. will be gathered at the end._

*What do we need:*
Soldering iron and tin solder
3 pieces of wire, same size as the fan's (never mind the colour)
Heat-shrink tubing
Wire cutter
Heat gun



*How to lengthen the Papst Fan wires - Step 1.:*
Cut the Papst Fan Wire, where you see fit! Remove insulation from all ends. Remove insulation from the 3 new wires as well.



*How to lengthen the Papst Fan wires - Step 2.:*
Now tin all of the wire ends: use the soldering iron to heat up each end and apply tin solder.





*How to lengthen the Papst Fan wires - Step 3.:*
Now solder the first 3 wires to the 3 new wires. If you have the same sickness as me, then the wires have to be the right colour, even though we are just lengthening them!







I'll use red, yellow and black, which are the normal colours used for these chassis fans. I'll connect "Blue --> Black", "Red --> Red" and "Yellow --> White". It's really not important, as long as you don't mix up the wires!











*How to lengthen the Papst Fan wires - Step 4.:*
Now cut 3 pieces of heat-shrink tubing, 2-4 cm long. Cover the soldered connections.



*How to lengthen the Papst Fan wires - Step 5.:*
Apply heat and shrink the heat-shrink tubing using the heat gun.



*How to lengthen the Papst Fan wires - Step 6.:*
Now do exactly the same with the 3 wires at the other end. Repeat steps 1-5. *Important!* Remember to slide the 3 pieces of heat-shrink tubing onto the wires before soldering!













*How to lengthen the Papst Fan wires - Finished.:*
Doing this with 2 Papst fans should be enough. If you are going to use the same chassis as me, then it is











*STATUS 29.10.14 9:39am*
Let's go on with mounting the Papst Fans, then mobo & graphic cards











.


----------



## DarkRyder

nice guides. love the pictures man. thank you.


----------



## tictoc

@DanHansenDK This is an awesome project, and thanks for sharing the whole process with us.


----------



## DanHansenDK

Hi DarkRyder & TicToc,

Positive as always







Thanks friends!!

*STATUS 30.10.14 10:11am*
We've finished installing the new Papst fans, and because this mobo is a standard ATX, I guess we don't have to move the "frame" with the 4 fans. Let's see what happens.
BTW, the reason for the short delay from yesterday until today is that I have been reading about architectures and which is to be preferred. From day 1 I have had the same plan: to build a "professional system", or as close as possible, for less dollars. I wanted us to have the advantages of industrial and professional equipment, and the results which follow, but it still had to be affordable for "mortal" people like you and me. To give an example: when we started using 4-way SLI (Nvidia's multiple-GPU setup), I first tested the very expensive Z87 Extreme VI card from Asus. Then I tried the Z87 OC Formula from ASRock, and now we have ended up using the Z97 OC Formula. The mobo from Asus was wonderful! A really thick and robust circuit board, lots of toys for engineers like pins and connectors for measuring and testing, the user-friendly BIOS interface from Asus, etc. But it was too expensive. It cost more than twice as much as the mobo we are using right now, and the 4x SLI mobo from ASRock has got what we need! Another thing is that the design of the ASRock mobo fitted the industrial 2U CPU cooler! It's the same thing with the CPU's: we don't just use the most expensive one. We'll choose a CPU which does the most for the dollar







And in our case, 8 threads are no advantage. OK, enough of the talking. Let's move on









*STATUS 30.10.14 10:48am*
Installing the 4 Fan frame, 2U PSU, mobo and the 4 GPU's


----------



## DanHansenDK

*STATUS 30.10.14 02:01pm*
)(/&%¤#"#¤%&/()... OK, a little hiccup! As with the Z87 mobo from ASRock, using multiple GPU's requires additional 12-volt current. The only problem here is that the connector is placed in a "not so wonderful" place! I'm trying to find a connector with a 90-degree angle, but maybe it doesn't exist!?!? If not, I'll have to develop one myself, but let's try to find one from the supplier first. If one of you guys knows that this connector actually exists, please don't hesitate to share the know-how








If only the connectors were placed like on the image below (top image), then there would be no problems! But of course the connector has to be placed right at the edge of the mobo, and mounted horizontally as well (red box). Way to go ASRock







. No, really... I can see a lot of advantages in mounting the connector like that!! NOT!!





*STATUS 30.10.14 03:28pm*
OK, I've emailed my component supplier about the issue. We'll see if they can find such a connector: 90-degree angle, female. In the meantime I'm trying a couple of things. First I'm making a home-made connector, just in case the 90-degree connector does not exist! I'm taking the 4 pins from a connector and then moulding them together using a 2-component glue. This is not a pretty solution, so I've just had the mobo up and looked at it. It's really not a big deal to replace the onboard connector with a vertical connector instead of this "not so great" horizontal one!!! Of course all warranty will disappear faster than you can say "ItWasn'tMeWhoDidThat"







But that's life in the fast lane







I'll show how to make the connector and I'll show the vertical connector as well.

Here's the "vertical" connector. This is the one (in the red box) that's usually used on circuit boards. As you can see on the drawing (in the little red box) it's the same one. But on the mobo they chose a horizontal version instead!?!?



*Making your own home-made connector:*
In case you haven't got the connector Mr. TicToc has shown below, this is how you make your own. You can even use it before moulding it with 2-component glue.:

*What do we need:*
Soldering iron and tin solder
4 pieces of heat-shrink tubing
1 female connector with wires from an old PSU or T-wire (cut & forget)
1 male connector as a moulding form
Wire-cutter
Heat gun
Oil (oil for sewing machines will do)
2-component glue

*Making your own connector - Step1:*
Take a standard 12-volt large connector (female) from any old PSU or T-wire etc. Remove the 4 pins. You may have to cut into the plastic to be able to press in/remove the stop on each pin; maybe you can do it with a very small screwdriver. Be sure to press in/remove the stop on every pin before pulling out the wire/pin. If you want to, you can bend the metal a little, and then the wire, to make a 90-degree angle. Be careful not to break it. Don't bend and then straighten out!!



*Making your own connector - Step2:*
Now slide a piece of heat-shrink tubing onto each of the 4 wires/pins. Cover as much of the pin/metal as possible:



*Making your own connector - Step3:*
Apply heat on the heat-shrink tubing to shrink it











*Making your own connector - Step4:*
Use a "robot/holder" if you have one. Now find a "male" connector and attach each pin in the right order. You can see it on the image, but the right order is red, black, black and yellow. If you want to mould it, continue to step 5. If not, just apply zip ties to secure the wires a little. (Please be careful! Check the wires one more time; they have to be in the same order as the connectors on e.g. the PSU.) Solder a "male" connector to the other end so that you can attach it to the PSU. Apply heat-shrink tubing to the 4 solder joints.



*Making your own connector - Step5:*
Moulding a connector. Use a paintbrush or something like that to apply a layer of grease or oil. Oil for sewing machines will do just fine! Apply oil, not too much, but it's important that every part of the "form/connector" has got oil on it. Find your 2-component glue and mix the 2 components together. Pour the glue into the connector: e.g. you can use a syringe, fill it up with the mixed glue and press it out into the "form/connector". Wait for the glue to harden, and then pull it out of the form. Solder a "male" connector to the other end so that you can attach it to the PSU. Apply heat-shrink tubing to the 4 solder joints.



*Making your own connector - Additional information:*
Some technical specs.




.


----------



## tictoc

You can use one of these connectors. Easy to install just clip off the standard connector, cut the wires to length, and then press them onto the connector. Very handy for getting into tight locations.



http://www.frozencpu.com/products/7616/ele-345/Standard_4_Pin_Low_Profile_Easy_Install_Connector_w_Cover_-_Black.html?tl=g44c155s642


----------



## DanHansenDK

TicToc.. That's Mr. TicToc









You just made my day!! My G.. you found it!! Can't get my hands down! I'm pretty worked up right now









Well, if there's any way I can get a bagful of those connectors I would be most grateful! Where did you find those? I searched the net and spent an hour looking through my 1500-page book of parts. Couldn't find anything!
As you know I'm from cold Scandinavia. Most of the stuff you can find in a TV repair shop over there just doesn't exist here ;( Crossing fingers right now and looking forward to hearing from you again









Thanks for the quick reply BTW









OK! Just noticed the link now and the shop seems to ship all the way to Viking-Land







Can't get my arms down


----------



## tictoc

Not sure where you can get these in Europe, but I believe that FrozenCPU does ship internationally, if you can't find them elsewhere.


----------



## msgclb

Do a search for

*mod/smart 4pin Multi-Use Molex 90° Connector - Black*

Although it's out of stock check *here*.


----------



## DanHansenDK

Hi TicToc....

Thanks.. Yes, they do ship internationally. And for only 7 dollars! It then takes about 14-30 days, but that's OK with me








I'm so thankful for this, because it would have made the project a bit more "technical". And I'm trying to keep it "doable"








I'll try to order a handful right away!

Hi msgclb...

Thanks for your input







That's actually right "next door". Maybe I can get it from there when it comes back in stock. The UK is a bit closer to me, that's for sure. I'll check it out. Thanks for your post!

*STATUS 31.10.14 2:07am*
OK... System is put together and I'm currently installing Ubuntu Server 14.04.1 and CUDA 6.5 (.deb). This worked on test system 4 (Beaufort) and it looks good so far.
As promised I'll upload images illustrating how to make the connector yourself. I'm using one right now







I'll upload it in post "125"









Issues still to be solved:
1. Implementation of my "WatchDog* shell-scripts"
2a. Ubuntu Server 14.04.1 issues (crashing&rebooting) Temperature or software issues??
2b. Trying to: "cp /etc/X11/XF86Config /etc/X11/xorg.conf" ---> WARNING: cp: cannot stat '/etc/X11/XF86Config': No such file or directory
3a. Mobo Chassis Fan & Temperature control --> 4x Papst Industrial Fans
3b. The Industrial Fans "Papst" have short wires! Not using the Aerocool X-vision makes this an issue! Lengthening 2 fan wires is needed! [SOLVED]
4. 2U Industrial CPU Cooler bracket issue! [SOLVED]
5. System mail & warnings --> only SMTP! [SOLVED]
6. Front filter! A swappable filter in front of the 2U chassis! Replace bolt/screw, too large; it stops the rack chassis from closing properly.
7. Rack-box Fan expansion!? Install another 2 large 220-volt fans, either in the top or in the bottom!!
8. Make a bracket for the 2U industrial PSU. (If you don't want to make one yourself, write me and I'll send you one for free. Details later on!)
9. 4x GPU's require a 12-volt connector on the mobo. Find a 90-degree angle connector in a store or make one yourself. [SOLVED]

*STATUS 31.10.14 3:38am*
Happy days








Now crunching data for [email protected] & [email protected]
Performing a 24-hour "Burn-In test"
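For anyone wanting to automate the temperature checks during a burn-in test like this, here's a minimal sketch of a logger that could run from cron. It only assumes `nvidia-smi` from the NVIDIA driver is installed; the log path and the cron interval are just examples, not part of this build.

```shell
#!/bin/sh
# Sketch: log GPU temperatures during a burn-in test.
# The log path /var/log/gpu-temp.log is an example choice.

# Extract just the numeric temperatures from an `nvidia-smi -a`
# report. Reads stdin, so it can be tested without a GPU present.
extract_temps() {
    grep 'GPU Current Temp' | awk -F: '{ gsub(/ C/,"",$2); gsub(/ /,"",$2); print $2 }'
}

# Append one timestamped line with all GPU temps to the log.
log_temps() {
    printf '%s %s\n' "$(date '+%F %T')" \
        "$(nvidia-smi -a | extract_temps | tr '\n' ' ')" >> /var/log/gpu-temp.log
}

# Example crontab entry (every 10 minutes):
# */10 * * * * /usr/local/bin/gpu-temp-log.sh
```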

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A



*STATUS 31.10.14 10:08am*
7 hours and ongoing.

Code:

# uptime
 10:09:39 up  6:54,  1 user,  load average: 4.65, 4.80, 4.84

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 43 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

*STATUS 31.10.14 10:30am*
Let's look at a couple of images from the installation while testing. When installing a system like this, it's not always enough to buy and install great hardware. Sometimes we have to do some "footwork".

*Where to connect the 4 Papst Fans on the mobo:*
First I'll show where to connect the 4 Papst fans. As you know, we are not using the Aerocool fan & temperature control for this system, for several reasons; read about it a few posts back. Instead we are going to use the 4 onboard chassis fan connectors. This will make it possible for us to use another of my shell-scripts, "WatchdogFanControl.sh".



*Keep it nice and in an orderly fashion:*
When installing the hardware it's important to keep all the wiring nice and in order. As you can see (red box), the power cables have been bundled using zip ties. The same goes for the SATA cables (green box), all placed at the top of the chassis (should have made an image of that, sorry!). It's all horizontal! You may have a laugh about this, but I actually noticed a difference between before and after it was done. I guess loose wiring will make, what's the word from aviation now, I should know... "turbulence"







All unused cables are kept out of the way! As you can see, I put them through the opening between the fans (blue box). This way we keep as little material as possible in the space where the air flows! Remember how much difference there was in temperature on GPU 4 before we changed fans!?



*Keep the space clear for stuff:*
It's important that we keep as much space as possible free for the air to flow (green arrows). And as you can see (blue box), the cables which are not in use fit perfectly in the empty bay on top of the DVD/CD-ROM











*Try to avoid this!*



.


----------



## DanHansenDK

*STATUS 31.10.14 10:59am*
I just noticed that test system 4 aka "Beaufort" has stopped running its 4 GPU's !! And I can't get in contact with "nvidia-smi". I guess this has to do with the newish "Kernel 3.13.0-32-generic", as written under "Issues still to be solved". Or maybe it's because of the missing 4-pin power connector for "PCI-e" (the one we just made ourselves and Mr. TicToc found for us online). I'll reinstall the system now, and then we'll have ourselves a "control group". Very scientifically correct
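A "WatchDog"-style check for exactly this situation could look like the sketch below: if `nvidia-smi` stops responding, restart the BOINC client. This is a minimal sketch, not one of the actual Watchdog scripts; the service name `boinc-client` and the restart-on-failure policy are assumptions.

```shell
#!/bin/sh
# Sketch: restart the BOINC client if the GPUs stop responding.
# Assumes the client runs as the "boinc-client" service.

# Takes the health-check command as arguments, so the logic can be
# tested without a GPU; prints what it decided to do.
check_gpus() {
    if "$@" >/dev/null 2>&1; then
        echo "gpus ok"
    else
        echo "gpus unresponsive - restarting boinc-client"
        # service boinc-client restart   # uncomment on the real box
    fi
}

# Real use, e.g. from cron every 5 minutes:
# check_gpus nvidia-smi
```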









.


----------



## Finrond

Quote:


> Originally Posted by *DanHansenDK*
> 
> *STATUS 31.10.14 10:59am*
> I just noticed that test system 4 aka "Beaufort" has stopped running its 4 GPU's !! And I can't get in contact with "nvidia-smi". I guess this has to do with the newish "Kernel 3.13.0-32-generic", as written under "Issues still to be solved". Or maybe it's because of the missing 4-pin power connector for "PCI-e" (the one we just made ourselves and Mr. TicToc found for us online). I'll reinstall the system now, and then we'll have ourselves a "control group". Very scientifically correct
> 
> 
> 
> 
> 
> 
> 
> 
> 
> .


Good luck, hope you get it back up and crunching in time for the BGB


----------



## DanHansenDK

Hi Finrond







Quote:


> Good luck, hope you get it back up and crunching in time for the BGB


I'm not sure about a lot of that BOINC groups/team stuff etc. But I can guess that BGB means "Boincers Gone Bonkers", as written in your signature







But according to that, it's in July!?!? Will anyone tell me a little about those projects and how to be a part of them!?

I've been posting quite a lot at "[email protected]/Linux" & "[email protected]/Number Crunching" about how to choose and configure all the systems to do GPU crunching for SETI as well. But I haven't got the right skills yet, and I got a lot of hard criticism. So for now, using systems from this project, we can't use our GPU's to crunch for SETI. I'll add this to the "ToDo - Issues to be solved" list, because we are of course going to support SETI as well







I know it can be done. It can be done by setting up app_config.xml to use apps from here (look below), but we are entering a dangerous area here. And I'm not going to make a ToDo or show you guys a way which ruins your computer! Well, after implementing the "Watchdog* Shell-Scripts" we can continue this talk:
http://lunatics.kwsn.net/index.php?module=Downloads
http://www.arkayn.us/forum/index.php?PHPSESSID=e29af30d394c458057ff1312400460e7&action=tpmod;dl=cat5

*Status "Test System 5" aka "Halifax" - 01.11.14 03:43am:*
Ideas - is this a good one?:
A. We could make a script, controlled by cron, which pauses the running jobs for, say, an hour, or maybe just 15 minutes!?!? To give the system a "little break". The question is whether this could cause new issues. Will the system be better off running 24/7, or would a "daily break" help protect it?
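Idea A could be sketched like this, using BOINC's documented `boinccmd --set_run_mode` switch. The 15-minute duration and the cron time are just examples; whether a daily break actually helps is exactly the open question above.

```shell
#!/bin/sh
# Sketch: give the rig a short daily break via cron.
# Assumes boinccmd is installed with the client.

# DRY_RUN=1 only prints the commands, so the logic is testable
# on a machine without BOINC installed.
pause_boinc() {
    minutes=${1:-15}
    run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$@"; else "$@"; fi; }
    run boinccmd --set_run_mode never   # suspend all work
    run sleep $((minutes * 60))         # the "little break"
    run boinccmd --set_run_mode auto    # back to normal operation
}

# Example crontab entry for a 15-minute break at 04:00:
# 0 4 * * * /usr/local/bin/boinc-break.sh
```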








Issues still to be solved:
1. Implementation of my "WatchDog* shell-scripts"
2a. Ubuntu Server 14.04.1 issues (crashing&rebooting) Temp./software issues? [TESTING]
2b. Trying to: "cp /etc/X11/XF86Config /etc/X11/xorg.conf" ---> WARNING: cp: cannot stat '/etc/X11/XF86Config': No such file or directory
3a. Aerocool X-vision can't control/show the CPU Fan using ASRock Z97/Z87 mobo's! All Aerocool X-vision gear is now dismounted. [SOLVED]
3b. Mobo Chassis Fan & Temperature control --> 4x Papst Industrial Fans; run tests to find the right config.
3c. The Industrial Fans "Papst" have short wires! Not using the Aerocool X-vision makes this an issue! Lengthening 2 fan wires is needed! [SOLVED]
4. 2U Industrial CPU Cooler bracket issue! [SOLVED]
5. System mail & warnings --> only SMTP! [SOLVED]
6. Front filter! A swappable filter in front of the 2U chassis! Replace bolt/screw, too large; it stops the rack chassis from closing properly.
7. Rack-box Fan expansion!? Install another 2 large 220-volt fans, either in the top or in the bottom!!
8. Make a bracket for the 2U industrial PSU. (If you don't want to make one yourself, write me and I'll send you one for free. Details later on!)
9. 4x GPU's require a 12-volt connector on the mobo. Find a 90-degree angle connector in a store or make one yourself. [SOLVED] [ORDERED]
10. Configure systems to crunch for SETI using the GPU's (SETI Enhanced? AstroPulse?). The standard SETI apps don't do GPU's on Linux. Implementation of "Applications" --> app_config.xml / cc_config.xml
11. Network adapter issue. Ethernet names are not consistent on Ubuntu Server 14.04: # ifconfig --> "p3p1".
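For issue 11, one possible fix is to pin the adapter name with a udev rule, sketched below under the assumption that udev reads `/etc/udev/rules.d/70-persistent-net.rules` on this Ubuntu Server 14.04 install. The MAC address is a placeholder you'd replace with your own from `ifconfig -a`.

```shell
#!/bin/sh
# Sketch: pin the NIC name so it comes up as "eth0" instead of "p3p1".
# RULES_FILE and MAC can be overridden from the environment.

RULES_FILE=${RULES_FILE:-/etc/udev/rules.d/70-persistent-net.rules}
MAC=${MAC:-00:11:22:33:44:55}   # placeholder - use your adapter's MAC

# Write a udev rule matching the adapter by MAC and naming it eth0.
write_net_rule() {
    printf 'SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="%s", NAME="eth0"\n' \
        "$MAC" > "$RULES_FILE"
}

# Run write_net_rule as root, then reboot; the adapter should come
# up as eth0 on every boot regardless of install order.
```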

*Reinstalling "Test System 4" aka "Beaufort":*
Had to take a few hours







Now reinstalling "Test System 4" aka Beaufort









*Status "Test System 5" aka "Halifax":*
After that I'll work on my Watchdog* Shell-scripts







Just have to find the right mobo/BIOS setting/configuration for the 4 chassis Fans







It's just so very cool that there are 4 chassis connectors on this board (image 1 - red boxes). Well, the Asus mobo Extreme VI had them too, but on a mobo which costs less than half, I think it's lucky those were not one of the things to be cut back. As you can see on the image (image 2 - green arrows), we are now able to control the airflow in each area of the chassis. It's a very good thing, yes it is!
BTW!! Some of you may have noticed that the 2U industrial CPU cooler has been turned 180 degrees, again! To begin with I installed the fan this way. But after "feeling" for the airflow from the fan, I noticed that most of the air came from the "fan side" of the cooler (the side where the fan is mounted onto the cooler). So I turned the cooler 180 degrees, and then I was pretty sure everything was fine. But after running the temp. test again, I noticed the increasing temp.!?!? Therefore I tested the fan one more time, using a low current, to see which way the fan blades actually turned. And of course I had got it all wrong. The fan blows air towards the heat sink, not away from it







So I turned the cooler one more time, and the temp. went down again! It wasn't a lot, just a bit, I think 3-5 degrees, but everything counts here!





*STATUS 01.11.14 05:17am:*
Status for test system 5 aka "Halifax". 24-hour burn-in test done! Let's "close it up" and go for 48 hours, shall we







Let's see what happens with the temp. This is the critical moment. Closing the chassis! There really isn't that much space in there.









*24 hours open - Burn-in test:*
After 24 hours this is the result. It has been running without any issues, with the system chassis open.

Code:

# uptime
 05:18:32 up 1 day,  2:03,  1 user,  load average: 4.91, 4.84, 4.85

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 43 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +50.0°C  (high = +80.0°C, crit = +100.0°C)

*24 hours closed - Burn-in test:*
The chassis will now be closed, monitored and tested for another 24 hours. Hourly checks, a little more often in the beginning








_ongoing work...._

Status after 5 min.:

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +52.0°C  (high = +80.0°C, crit = +100.0°C)

Status after 10 min.:

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +50.0°C  (high = +80.0°C, crit = +100.0°C)

Status after 20 min.:

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +51.0°C  (high = +80.0°C, crit = +100.0°C)

Status after 30 min.:

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +48.0°C  (high = +80.0°C, crit = +100.0°C)

Status after 1 hour:

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +49.0°C  (high = +80.0°C, crit = +100.0°C)

Status after 2 hours:

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +50.0°C  (high = +80.0°C, crit = +100.0°C)

Status after 4 hours:

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +50.0°C  (high = +80.0°C, crit = +100.0°C)

*24 hours closed burn-in test - 8 hours:*
Status after the first 8 hours running closed. No crashes so far! 2 GPU's are a little high according to my standards and other tests, but still under 50 degrees Celsius, which is pretty great considering the GPU's are running at 100%, 24/7/365 !!! The CPU is doing great, as it always is with the industrial 2U CPU cooler









Code:

# uptime
 14:30:21 up 1 day, 11:15,  1 user,  load average: 4.93, 4.94, 4.87

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +51.0°C  (high = +80.0°C, crit = +100.0°C)

*24 hours closed burn-in test - 11 hours:*
OK, I would like the temp. to go down just a bit on a couple of the cards. But we haven't started configuring the 4 chassis fan controls yet! Right now they are running the "standard" setting in the BIOS. When 12 hours have passed I'll try setting them to "Full speed". Since we are no longer using the fan controller from Aerocool, the plan is to make a custom configuration for each chassis fan in the BIOS and then have the Watchdog* Shell-Scripts watch them. It's possible to control some settings through the scripts as well, but we will have a master configuration for the 4 chassis fan controls in the BIOS, and it'll take some time to find the right setup. The idea is to have a safe custom setup for the 4 chassis fans in the BIOS and mostly use the Watchdog* Shell-Scripts for monitoring. I don't like the fans constantly changing speed. Running as they do now, they keep increasing and decreasing rpm, and that's not very suitable for a system like this; there are too many heat sources and sensors for the system to run steady. So the plan will be something like this: all chassis fans run at 80%. If chassis sensors 1-4 exceed e.g. 55 degrees Celsius, increase speed to 90%. Critical temp. is e.g. 65 degrees Celsius. Nice and simple! It's possible we'll only use 1 limit for increasing fan speed plus the critical limit. Depends on the test results
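The plan above can be sketched as a small decision function. How the chosen percentage is actually applied depends on the board (a BIOS profile, or tools like `fancontrol`), so this only computes the target speed; the thresholds mirror the example figures mentioned, and going to 100% at the critical limit is my assumption.

```shell
#!/bin/sh
# Sketch of the fan plan: 80% baseline, 90% above the warning
# temperature, maximum at the critical limit.

WARN_TEMP=55    # degrees C - raise fan speed above this
CRIT_TEMP=65    # degrees C - critical limit

# Print the target fan speed (%) for a given sensor temperature.
decide_fan_speed() {
    temp=$1
    if [ "$temp" -ge "$CRIT_TEMP" ]; then
        echo 100    # critical: everything to maximum
    elif [ "$temp" -ge "$WARN_TEMP" ]; then
        echo 90
    else
        echo 80     # normal steady-state speed
    fi
}

# Example: decide_fan_speed 58   -> prints 90
```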









Code:

# uptime
 17:28:20 up 1 day, 14:13,  1 user,  load average: 4.85, 4.79, 4.83

Code:

# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +48.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +48.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +46.0°C  (high = +80.0°C, crit = +100.0°C)

*24-hour closed burn-in test - 24+ hours!!:*
Done!! The test is complete and it seems to be a stable system. After running 24 hours open and 24 hours closed, the GPUs (and of course the CPU) keep steady temperatures. Once the chassis fan control settings have been configured, they'll surely drop a few more degrees.
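To keep a temperature history during these burn-in runs instead of sampling by hand, a root crontab entry along these lines could log a snapshot every 10 minutes. The log file path and interval here are just examples, not the ones used in this build:

```text
# /etc/crontab-style entry (example path and interval):
# append a timestamped load/temperature snapshot every 10 minutes
*/10 * * * * root date >> /var/log/cruncher-temps.log; uptime >> /var/log/cruncher-temps.log; nvidia-smi -a | grep "GPU Current Temp" >> /var/log/cruncher-temps.log
```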

Code:


# uptime
 08:16:25 up 2 days,  5:01,  1 user,  load average: 4.71, 4.78, 4.77

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +49.0°C  (high = +80.0°C, crit = +100.0°C)

BTW! Have you noticed the fairly low temperatures on some of the CPU cores!? And this is achieved using an air cooler, not liquid, with the CPU running at 100% 24/7

It certainly helped turning it the right way, keeping it blowing air in the same direction as the other fans

Core 0: +55.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: *+51.0°C* (high = +80.0°C, crit = +100.0°C)
Core 2: +52.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: *+49.0°C* (high = +80.0°C, crit = +100.0°C)

*24-hour closed burn-in test - Test Ended!:*

.


----------



## DanHansenDK

*Reinstalling "Test System 4" aka "Beaufort":*
Beaufort has been shut down for maintenance and is off the grid. Reinstalling according to ToDo version:

Code:


HEADLESS LINUX CLI MULTIPLE GPU BOINC SERVER - RACK-MOUNTED HEADLESS BOINC SUPER CRUNCHER
OS: Ubuntu Server 14.04.1 64Bit
KERNEL: 3.13.0-32-generic
CUDA: CUDA 6.5
BOINC: v.7.2.42
TODO: v.1.1.5 31.10.14 09:15:00

*STATUS "Test System 4" aka "Beaufort" - 01.11.14 08:47am:*
Reinstallation using the ToDo went pretty swiftly

But there are still spooks around. Of course it's because of the missing power for the PCIe slots. But I got it to run last time without the power line. Sadly, the manual for the ASRock Z87 OC Formula said something completely different! I know, because I have studied it very closely! Here you are told to use the connector if you are using 2 graphics cards. Not "more than 2 cards", no, only when "using 2 cards"! Then, after reading the manual for the new Z97 mobo (Test System 5), it suddenly reads loud and clear:

*ASRock Z97 OC Formula Manual page 25.:*
_PCIe Power Connector. Please connect a 4 pin molex power cable to this connector when more than 3 graphic cards are installed._
Thanks a f...... lot

Running the test commands suddenly reveals the issue:

Code:


# nvidia-smi -a
Unable to determine the device handle for GPU 0000:04:00.0: Unknown Error

OK, maybe I had an idea of what was causing this. Of course I did. But it would have been so much easier if the info on an important matter like this hadn't been so equivocal. I'll just make another power cable and connect it. Let's see if this solves it; I'm pretty sure it does. But it would have been nice to recreate the same situation/installation to find the exact reason for the crash, right???
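Since a missing PCIe power feed shows up as exactly this kind of "Unable to determine the device handle" error, a small pre-flight check before starting BOINC could catch it early. A sketch of my own (the expected count of 4 matches this build; the parsing assumes `nvidia-smi -L` output lines of the form `GPU 0: ... (UUID: ...)`):

```shell
#!/bin/sh
# Pre-flight check: make sure all GPUs answer before BOINC starts.

EXPECTED_GPUS=4

# Count "GPU N: ..." lines in nvidia-smi -L style output on stdin
count_gpus() {
    grep -c '^GPU [0-9]'
}

# Only run the live check when the NVIDIA driver tools are installed
if command -v nvidia-smi >/dev/null 2>&1; then
    found=$(nvidia-smi -L 2>/dev/null | count_gpus)
    if [ "$found" -ne "$EXPECTED_GPUS" ]; then
        echo "Only $found of $EXPECTED_GPUS GPUs respond - check the PCIe power connector!" >&2
    fi
fi
```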

*STATUS "Test System 4" aka "Beaufort" - 01.11.14 5:16pm:*
Just finished the "home-made" connector! Let's install it in "Test System 4" aka "Beaufort" and see if the problem is solved. BTW here's some nice stuff to look at while installation continues. This was the first output to the monitor after making a clean installation earlier on. _Please notice that I had to write this screen info down on paper, so the numbers "x.xxxxxx" aren't all accurate_:

Code:


Login: [9.665205] nouveau E [ PIBUS ][0000:04:00.0] HUBO 7.6545564
Login:  [8.545564] nouveau E [ PIBUS ][0000:04:00.0] GPC0 6.4556465
5.265564 systemd-udevd [1051] failed to apply ACL on /dev/dri/card3 : no such file or directory

*STATUS "Test System 4" aka "Beaufort" - 02.11.14 8:43am:*
Installing PCI-e connector and making a clean installation of Headless Linux CLI Multiple GPU Boinc Server --> Ubuntu Server 14.04.1/CUDA6.5
I'm hoping to see the issue described above go away, right after installing the connector. I'll test it before re-installing the system

It certainly sounds like a possibility, right?:

ASRock Z97 OC Formula Manual page 25.:
_PCIe Power Connector. Please connect a 4 pin molex power cable to this connector when more than 3 graphic cards are installed._

Reinstalled System Output - No PCI-e connector attached:
_"...systemd-udevd [1051] failed to apply ACL on /dev/dri/card3:.."_

*STATUS "Test System 4" aka "Beaufort" - 02.11.14 10:42am:*
BTW! The Aerocool X-Vision has been dismounted on this system as well!! Doing this automatically solved the issue with controlling/viewing the CPU fan on the ASRock Z97/Z87 mobos

*STATUS "Test System 4" aka "Beaufort" - 03.11.14 3:21pm:*
Sadly, "Beaufort" has to be replaced. I tried 673 things, but I can't solve the issue. I replaced all the hardware on the mobo and tested the same hardware (CPU, graphics card, memory etc.) on another mobo without being able to recreate the error. The mobo just crashes and reboots; it will not run more than a few hours before it crashes. I've written my supplier and ordered a replacement mobo. But I've decided to replace it with the new Z97 board. Why? Because I'm pretty sure this mobo is defective. I've had nothing but trouble with it. At the same time I don't like that I have to work on the CPU cooler to make it fit. With the Z97 there were no problems; the bracket fitted just fine. As you probably remember, it was the same thing with the very expensive mobo from Asus (Extreme VI) and other Asus boards. But I managed to find a mobo (Z97 OC Formula) which can be used without power tools and which has a normal size, ATX. The other mobos had odd sizes: 2-3 cm longer. Not much, but enough to ruin the plan. As you probably remember, we had to move the bracket which kept the 4 chassis fans in place.
Therefore Test System 4 will be rebuilt and used as a control unit for Test System 5: two identical systems with the exact same software and configuration.

Test System 5 aka Halifax has been running perfectly since we started the test! No crashes and no issues. I'll get back to you when I'm ready with the new hardware. Then we'll continue with the implementation of the shell-scripts. Meanwhile I've been solving the issue of sending system mails without the need to install a complete mail system. We are going to use the very nice little piece of software called "MSMTP"
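For reference, MSMTP only needs a small config file to act as a sendmail-style relay. A generic example; the host, addresses and password below are placeholders, not our actual account:

```text
# ~/.msmtprc - minimal msmtp setup (placeholder values!)
defaults
auth           on
tls            on
tls_trust_file /etc/ssl/certs/ca-certificates.crt
logfile        ~/.msmtp.log

account        default
host           smtp.example.com
port           587
from           cruncher@example.com
user           cruncher@example.com
password       CHANGE_ME
```

A test mail can then be sent with something like `printf 'Subject: test\n\nHello\n' | msmtp recipient@example.com`. Remember `chmod 600 ~/.msmtprc`; msmtp refuses to use a world-readable file containing a password.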

See ya later on...

*STATUS "Test System 4" aka "Beaufort" - 08.11.14 8:25am:*
YES!! The hardware has been replaced! A very good supplier I've got... replacing the faulty hardware BEFORE receiving the returned hardware

That's my best supplier ever!! Replacing the hardware on Test System 4 aka Beaufort. Stand by

BTW! Hello friends

Hope you'll have a nice weekend

.


----------



## DanHansenDK

*STATUS "Test System 5" aka "Halifax" - 08.11.14 11:58am:*
Test System 5 aka Halifax runs perfectly!! Since I restarted it at the beginning of the week, it hasn't missed a beat. And please notice the temperatures!! They are very low; lower than we've been able to accomplish before. I think we've got it now. I think we've found the recipe for the first "Headless Linux CLI Multiple GPU Boinc Server". Version 1.05 of the _economic semi-pro super cruncher_

*STATUS "Test System 4" aka "Beaufort" - 08.11.14 12:14am:*
When Test System 4 aka Beaufort has been assembled, it'll be installed in the rack to begin testing inside the rack. This is the last "issue" to check/solve. I'm not sure it'll be a problem, but we'll have to test it. According to my calculations, temperatures will rise a bit, but nothing serious. So, let's see what happens. Once it's installed in the rack, I'll move on to the shell-scripts on Test System 5 aka Halifax. We got a little delayed because of the hardware issue on Test System 4 aka Beaufort, but now it seems we are back on track

Running at 100% 24/7/365

Code:


# uptime
 11:58:40 up 5 days, 19:32,  1 user,  load average: 4.59, 4.77, 4.81

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +49.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +51.0°C  (high = +80.0°C, crit = +100.0°C)



*STATUS "Test System 5" aka "Halifax" - 09.11.14 09:24am:*
Problems with the upload server at Asteroids@home! You can see it in the GPU temps; they've dropped remarkably. But the system itself keeps running nicely. Almost 7 days at 100% without any issues. This is looking really good! The nice, low CPU temperatures show us that this system is running well. The CPU is working at 100% 24/7 and keeps a steady 50 degrees Celsius. Using a pretty compact fan cooler, I think we can call this part of the system a success

Code:


# uptime
 09:20:17 up 6 days, 16:54,  1 user,  load average: 4.10, 4.17, 4.15

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 30 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 29 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 29 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 27 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +49.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +47.0°C  (high = +80.0°C, crit = +100.0°C)

.


----------



## DarkRyder

looks real nice man


----------



## DanHansenDK

Hello DarkRyder

Thanks man.... BTW, do you support/have you joined Asteroids@home? If you do, have you had any problems this weekend? I haven't been able to upload finished work units for 3 days now!?!? Asteroids@home has just been "down" for 2-3 days, or at least the upload server "bruno" has!! Wasn't that exactly the days your project ran??? The "boinckers gone......." thing???

*STATUS "Test System 4" aka "Beaufort" - 09.11.14 09:16am:*
Replacement of the mobo on Test System 4 aka Beaufort is almost done. Looking forward to starting the testing of the shell-scripts

Let's hope the Asteroids@home upload server will soon be online again, so that we can deploy and connect this "twin" system and make 4 more GPUs run

*STATUS "Test System 4" aka "Beaufort" - 09.11.14 10:30am:*
Done!! Let's begin installing Linux kernel 3.13.0-32-generic & CUDA 6.5 to make those GPUs sweat

I'm currently at ToDo v.1.1.5, which will be modified again today when implementing the shell-scripts

This will of course happen on Test System 5 aka Halifax, because Test System 4 aka Beaufort (this system) will have to be tested first. We'll make a 2x 24-hour test, without and with the "hood" down

.


----------



## DarkRyder

i have ran asteroid in the past sure, i have never ran anything on linux tho.. i'm sorry


----------



## DanHansenDK

Hi DarkRyder.

It's all right my friend

I asked mostly because of the current problems at Asteroids@home. There seem to be some issues with the upload/download of new/finished work units.

*Status "Test System 6" aka "Stirling" - 10.11.14 00:48am:*
NB! The next system will be a test system that we'll keep developing! This will be the system which takes us to the next level. If this "next level" system turns out to be something completely different or much more expensive, we'll call it "Headless Linux CLI Multiple GPU Boinc Server - ProVersion" or something like that, because we just might have to implement water cooling. I'm not sure. And then we'd have to use the larger PSU, which is a bit more expensive. OK, we'll see about that. The hardware for this Test System 6 has been ordered! The system will be called "Stirling". When we've finished testing Test Systems 5 & 4 and solved the list of issues, all 5 test systems will be modified to contain the same hardware. I'll soon make a note/list showing which hardware parts have been chosen: chosen to stay permanently in this Headless Linux CLI Multiple GPU Boinc Server - v.SemiPro/version 1.

*Status "Test System 5" aka "Halifax" - 10.11.14 02:38am:*
Ideas - is this a good one?:
A. We could make a script, controlled by CRON, which "pauses" the running jobs for, say, an hour or maybe just 15 minutes, to give the system a "little break". The question is whether this could cause new issues. Will the system be better off running 24/7, or would a "daily break" help protect it?
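If idea A turns out to be worth trying, it might not even need a custom script: as far as I know, `boinccmd --set_run_mode` takes an optional duration in seconds, after which the client reverts to its previous mode on its own. A sketch of the crontab entry (the 04:00 time and the 15-minute break are arbitrary examples):

```text
# root crontab entry (example schedule): suspend all BOINC work at 04:00
# for 900 seconds (15 min); the client reverts to its old mode by itself.
0 4 * * * /usr/bin/boinccmd --set_run_mode never 900
```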

Issues still to be solved:
1. Implementation of my "WatchDog* shell-scripts"
2a. Ubuntu Server 14.04.1 issues (crashing & rebooting). Temp./software issues? [TESTING]
2b. Trying to: "cp /etc/X11/XF86Config /etc/X11/xorg.conf" ---> WARNING: cp: cannot stat '/etc/X11/XF86Config': No such file or directory
3a. Aerocool X-Vision can't control/show the CPU fan using ASRock Z97/Z87 mobos! All Aerocool X-Vision units are now dismounted. [SOLVED]
3b. Mobo chassis fan & temperature control --> 4x Papst industrial fans; run tests to find the right config.
3c. The Papst industrial fans have short wires! Not using the Aerocool X-Vision makes this an issue! Lengthening 2 fans' wires is needed! [SOLVED]
4. 2U industrial CPU cooler subside-bracket issue! [SOLVED]
5. System mail & warnings --> SMTP only! [SOLVED]
6. Front filter! A swappable filter in front of the 2U chassis! Replace bolt/screw, too large; stops the rack chassis from closing properly.
7. Rack-box fan expansion!? Install another 2 large 220-volt fans, either in the top or in the bottom!!
8. Make a bracket for the 2U industrial PSU. (If you don't want to make one yourself, then write me and I'll send you one for free. Details later on!)
9. 4x GPUs require the 12-volt connector on the mobo. Find a 90-degree-angle connector in a store or make one [SOLVED] [ORDERED]
10. Configure systems to crunch using GPUs for SETI (SETI Enhanced? AstroPulse?). Non-standard SETI apps for Linux/GPUs. Implementation of "Applications" --> app_config.xml / cc_config.xml
11. Network adapter issue. Ethernet names not consistent on Ubuntu Server 14.04. # ifconfig --> "p3p1".
*Status "Test System 4" aka "Beaufort" - 10.11.14 02:49am:*
Installing Linux kernel 3.13.0-32-generic & CUDA 6.5 - ToDo v.1.1.6. Due to all the open issues, the ToDo is still very "complicated". This is the only reason I'm not presenting it; it fills a lot, too. Anyone who needs to use it, just say so and I'll post the newest version.

HEADLESS LINUX CLI MULTIPLE GPU BOINC SERVER - RACK-MOUNTED HEADLESS BOINC SUPER CRUNCHER
OS: Ubuntu Server 14.04.1 64Bit
KERNEL: 3.13.0-32-generic
CUDA: CUDA 6.5
NVIDIA: v.340.29
BOINC: v.7.2.42
TODO: v.1.1.6 10.11.14 01:17:00

*Status "Test System 4" aka "Beaufort" - 10.11.14 03:52am:*
Happy days!!! CUDA 6.5 is installing. It takes about 10 minutes when using the .deb (Debian package) and the system is just working hard. Love it!! Just love it

Code:


# wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_6.5-14_amd64.deb
# dpkg -i cuda-repo-ubuntu1404_6.5-14_amd64.deb
# apt-get update
# apt-get install cuda-6-5



*Status "Test System 4" aka "Beaufort" - 10.11.14 04:26am:*
System hardware replaced! Linux reinstalled! CUDA6.5 installed! System requirements for headless rendering installed!
Let's install the BOINC client (only the client, because this is a CLI-based system). Dash will even be replaced with Bash in the next test
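The client-only install and project attach amount to just a few commands, shown here in the same root-prompt style as the other listings. The project URL and the account key are placeholders for your own:

```shell
# apt-get install boinc-client
# boinccmd --project_attach http://asteroidsathome.net/boinc/ YOUR_ACCOUNT_KEY
# tail -f /var/lib/boinc-client/stdoutdae.txt
```

The last command watches the client's log, so you can see the GPUs being detected (or not) right after the driver loads.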

Code:


# uptime
 04:25:06 up 23 min,  1 user,  load average: 0.04, 0.03, 0.03

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 29 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 30 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 30 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 28 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +32.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +28.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +27.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +31.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +29.0°C  (high = +80.0°C, crit = +100.0°C)

*Status "Test System 4" aka "Beaufort" - 10.11.14 06:27am:*
System up and running. Crunching on 4x GPUs and all 4 CPU cores.

Status 24 hour test - "open hood" - 15 minutes:

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 42 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 43 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +50.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "open hood" - 30 minutes:

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 42 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 43 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +48.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "open hood" - 1 hour:

Code:


# uptime
 05:14:41 up  1:12,  1 user,  load average: 5.44, 5.05, 4.41

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 42 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +53.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "open hood" - 4 hours:

Code:


# uptime
 08:15:28 up  4:13,  1 user,  load average: 5.44, 5.05, 4.41

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 42 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +48.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "open hood" - 8 hours:

Code:


# uptime
 12:05:57 up  8:04,  1 user,  load average: 5.13, 5.01, 4.94

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 41 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 43 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +55.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "open hood" - 12 hours:

Code:


# uptime
 15:51:28 up 11:49,  1 user,  load average: 4.94, 4.90, 4.93

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 41 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 43 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +56.0°C  (high = +80.0°C, crit = +100.0°C)

.


----------



## DanHansenDK

*Status "Test System 4" aka "Beaufort" - 10.11.14 01:43pm:*
System is now installed and configured!! The test has been running for 9+ hours. But I'm still having a lot of issues communicating with Asteroids@home!! Anybody else having the same problems??? Downloading jobs wouldn't work, uploading either... Can't refresh the website... I've been trying to read about it online, but I can't, because I'm not able to reach the website/forum!?!? This has been an issue for quite some time now. It comes and goes, so I've just told myself that it's due to a "bad" internet connection. But I'm having no problems connecting to e.g. SETI. Microseconds!!

Here's a sample from earlier on:

244 [email protected] 10-11-2014 06:04:08 Requesting new tasks for NVIDIA
243 [email protected] 10-11-2014 06:04:08 Reporting 9 completed tasks
242 [email protected] 10-11-2014 06:04:08 Sending scheduler request: Requested by user.
241 [email protected] 10-11-2014 06:04:06 update requested by user
[...]
237 [email protected] 10-11-2014 05:41:04 Starting task ps_141020_326060_4_1
236 [email protected] 10-11-2014 05:41:04 Starting task ps_141020_326060_3_1
235 [email protected] 10-11-2014 05:41:04 Finished download of libcudart.so.5.5
234 [email protected] 10-11-2014 05:40:39 Giving up on download of input_325549_6: permanent HTTP error
233 [email protected] 10-11-2014 05:40:37 Started download of input_325549_6
232 [email protected] 10-11-2014 05:40:37 Giving up on download of input_325583_2: permanent HTTP error
231 [email protected] 10-11-2014 05:40:35 Started download of input_325583_2
230 [email protected] 10-11-2014 05:40:35 Giving up on download of input_325645_2: permanent HTTP error
229 [email protected] 10-11-2014 05:40:33 Started download of input_325645_2
228 [email protected] 10-11-2014 05:40:33 Giving up on download of input_325649_1: permanent HTTP error
227 [email protected] 10-11-2014 05:40:31 Started download of input_325649_1
226 [email protected] 10-11-2014 05:40:31 Giving up on download of input_325763_2: permanent HTTP error
225 [email protected] 10-11-2014 05:40:29 Started download of input_325763_2
224 [email protected] 10-11-2014 05:40:29 Giving up on download of input_325816_4: permanent HTTP error
223 [email protected] 10-11-2014 05:40:27 Started download of input_325816_4
222 [email protected] 10-11-2014 05:40:27 Giving up on download of input_325816_7: permanent HTTP error
221 [email protected] 10-11-2014 05:40:23 Started download of input_325816_7
220 [email protected] 10-11-2014 05:40:23 Giving up on download of input_326028_2: permanent HTTP error
219 [email protected] 10-11-2014 05:40:21 Started download of input_326028_2
218 [email protected] 10-11-2014 05:40:21 Giving up on download of input_326033_5: permanent HTTP error
217 [email protected] 10-11-2014 05:40:19 Started download of input_326033_5
216 [email protected] 10-11-2014 05:40:19 Finished download of input_326060_4
215 [email protected] 10-11-2014 05:40:16 Started download of input_326060_4
214 [email protected] 10-11-2014 05:40:16 Finished download of input_326060_3
213 [email protected] 10-11-2014 05:40:14 Started download of input_326060_3
212 [email protected] 10-11-2014 05:40:14 Finished download of period_search_10112_x86_64-pc-linux-gnu__cuda55
211 [email protected] 10-11-2014 05:39:58 Started download of libcudart.so.5.5
210 [email protected] 10-11-2014 05:39:58 Started download of period_search_10112_x86_64-pc-linux-gnu__cuda55
209 [email protected] 10-11-2014 05:39:56 Scheduler request completed: got 11 new tasks
208 [email protected] 10-11-2014 05:39:00 Requesting new tasks for NVIDIA
207 [email protected] 10-11-2014 05:39:00 Sending scheduler request: To fetch work.
[...]

.


----------



## tictoc

Quote:


> Originally Posted by *DanHansenDK*
> 
> *Status "Test System 5" aka "Beaufort" - 10.11.14 05:49am:*
> System is now installed and configurated!! But, I'm still having a lot of issues on communicating with [email protected] !! Anybody who has the same problems??? Downloading jobs wouldn't work, uploading either... Can't refresh website... I've been trying to read about it online, but I can't because I'm not able to enter the website /forum!?!? This has been an issue for wuite some time now. It comes and goes, therefore I've just told myself, that it's due to a "bad" internet connection. But I'm having no problems connecting to e.g. SETI. Microseconds!!
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> .


The problem is with the Asteroids server, not your connection.

Looks like they have been having issues since at least the 29th of October. Here is a handy thread in the main BOINC forum that keeps track of project outages. News on Project Outages

*Edit:* It appears that Asteroids just came back online while I was typing my response. Server Status


----------



## DanHansenDK

Hi TicToc

Quote:


> The problem is with the Asteroids server not your connection.
> 
> Looks like they have been having issues since at least the 29th of October. Here is a handy thread in the main BOINC forum that keeps track of project outages. News on Project Outages
> 
> *Edit* It appears that asteroids just came back online while I was typing my response. Server Status


Thanks for the great links

I didn't know about those, so thanks a lot

BTW! It's d... irritating that there's no time stamp on a normal post/thread. Only after a post has been edited/modified does the time stamp appear. It's not very user-friendly

*Status "Test System 5" aka "Halifax" - 10.11.14 02:57pm:*
Change of plans

"Test System 5" aka "Halifax" is running very steadily! Since "Test System 4" aka "Beaufort" has been modified into a complete "clone" of "Test System 5" aka "Halifax", the ongoing development will be transferred to Beaufort. We'll let "Test System 5" aka "Halifax" run continuously to see how well it keeps working.
This means that we'll solve the issues below and continue the development on "Test System 4" aka "Beaufort". It's a little upside-down, but we'll manage, right

I'm working on a website where the status can be checked continuously. Until that's done, I'll keep posting the status of the running test systems here.

*Status "Test System 4" aka "Beaufort" - 10.11.14 02:58pm:*
Ideas - is this a good one?:
A. We could make a script, controlled by CRON, which "pauses" the running jobs for, say, an hour or maybe just 15 minutes, to give the system a "little break". The question is whether this could cause new issues. Will the system be better off running 24/7, or would a "daily break" help protect it?
Issues still to be solved:
1. Implementation of my "WatchDog* shell-scripts" [ONGOING WORK]
2a. Ubuntu Server 14.04.1 issues (crashing&rebooting) Temp./software issues? [TESTING]
2b. Trying to: "cp /etc/X11/XF86Config /etc/X11/xorg.conf" ---> WARNING: cp: cannot stat '/etc/X11/XF86Config': No such file or directory
3a. Aerocool X-vision can't control/show CPU Fan using ASRock Z97/Z87 mobo's! All Aerocool X-vision is now dismounted. [SOLVED]
3b. Mobo Fan- & Temperature control --> CPU & GPU's x4 Papst Industrial Fans. Test & make BIOS configuration model. [ONGOING WORK]
3c. Industrial "Papst" fans have short wires! Not using the Aerocool X-vision makes this an issue! Lengthening 2 fans' wires is needed! [SOLVED]
4. 2U Industrial CPU Cooler subside bracket issue! [SOLVED]
5. System mail & warnings --> only SMTP! [ONGOING WORK]
6. Front-filter! A swap-able filter in front of the 2U chassis! Replace bolt/screw, too large; it stops the rack chassis from closing properly.
7. Rack-box Fan expansion!? Install another 2 large 220volts fans either in the top or in the bottom!!
8. Make a bracket for the 2U industrial PSU. (If you don't want to make one yourself, then write me and I'll send you one for free. Details later on!)
9. 4x GPU's requires 12volt connector on mobo. Find 90 degree angle connector in a store or make connector [SOLVED] [ORDERED]
10. Configure systems to crunch using GPU's for SETI (SETI Enhanced? AstroPulse?). The standard SETI apps don't do GPU's on Linux. Implementation of "Applications" --> app_config.xml / cc_config.xml
11. Network adapter issue. Ethernet names are not consistent on Ubuntu Server 14.04. # ifconfig --> "p3p1".

*Status "Test System 4" aka "Beaufort" - 10.11.14 03:08pm:*
Now I'll install "MSMTP" and set up an email account to which alerts, warnings and notifications will be mailed. Next I'll continue the work on the shell-scripts. Finally I'll start the implementation of a fan-control shell-script using LM-sensors to control & monitor the CPU/GPU fans. As you surely remember, the CPU fan, GPU fans and chassis fans are all controlled by the BIOS for now.

HEADLESS LINUX CLI MULTIPLE GPU BOINC SERVER - RACK-MOUNTED HEADLESS BOINC SUPER CRUNCHER
OS: Ubuntu Server 14.04.1 64Bit
KERNEL: 3.13.0-32-generic
CUDA: CUDA 6.5
NVIDIA: v.340.29
BOINC: v.7.2.42
TODO: v.1.1.6 10.11.14 01:17:00

Currently working on:
MSMTP - send email "system notifications", "warnings" & "alerts".
Mobo Fan- & Temperature control --> CPU & GPU's x4 Papst Industrial Fans. Test & make BIOS configuration model.
Implementation of my "WatchDog* shell-scripts"

Let's do this









MSMTP - send email "system notifications", "warnings" & "alerts".
Here's how we are going to do this. We are going to install a nice little piece of software called "MSMTP"








You can install MSMTP by following this little ToDo. Afterwards we'll set up a mail account. For some time I had problems getting it to work with Gmail, but I solved it. Follow these steps:

Code:


1 Install MSMTP and certificates. Configure MSMTP to email System Notifications, Warnings & Alerts:
# apt-get install msmtp msmtp-mta ca-certificates

2 Now it's time to configure MSMTP. Let's make a configuration file:
# vi /etc/msmtprc

Make the file look like this, to send system email through another mail server:
File:
Insert content from the sample below: Content of /etc/msmtprc - Set up mail accounts in MSMTP

3 Now it's time to test that e-mail works! First, we will run a "pretend test" to make sure the settings we have just configured have taken effect. Verify that the output is correct as best as you know it to be:
# msmtp --pretend

4 Now, we will try an actual test. Modify "[email protected]". If all goes well, you should receive an e-mail very shortly:
# echo "This is a test e-mail from my server using msmtp" | msmtp -d [email protected]

"Content of /etc/msmtprc - Set up mail accounts in MSMTP"
_Please notice that when sending mail using your Gmail account, you may need to choose lower security. Just google "google account lower security"._

Code:


# ------------------------------------------------------------------------------
# Accounts
# ------------------------------------------------------------------------------

## ATTEMPT TO MAKE GMAIL SMTP WORK ON GMAIL ACCOUNT WITH LESS SECURITY - WORKS!!
## MAILSERVER FOR SENDING SYSTEM EMAIL'S
## [email protected] account
account gmail
host smtp.gmail.com
port 587
# from [email protected]
tls on
tls_starttls on
tls_trust_file /etc/ssl/certs/ca-certificates.crt
auth on
user [email protected]
password ********
syslog LOG_MAIL

# ------------------------------------------------------------------------------
# MSMTP System Wide Configuration file
# ------------------------------------------------------------------------------

# A system wide configuration is optional.
# If it exists, it usually defines a default account.
# This allows msmtp to be used like /usr/sbin/sendmail.

# ------------------------------------------------------------------------------
# Sample accounts
# ------------------------------------------------------------------------------

## Some ISP account
# account someisp
# host mail.someisp.tld
# port 587
# from [email protected]
# auth login
# user [email protected]
# password ********
# syslog LOG_MAIL

## Gmail account
## Configuring a Gmail account may require lower security for the Gmail account
## Googling for "gmail msmtp" & "Gmail account lower security" should help
# account gmail
# host smtp.gmail.com
# port 587
# from [email protected]
# tls on
# tls_starttls on
# tls_trust_file /etc/ssl/certs/ca-certificates.crt
## CAN BE THESE CERTIFICATES INSTEAD
## tls_trust_file /usr/share/ca-certificates/mozilla/Equifax_Secure_CA.crt
# auth on
# user [email protected]
# password ********
# syslog LOG_MAIL
## CAN BE THIS LOG FILE INSTEAD
## logfile /var/log/msmtp.log

# Other ISP Account
# Configuring for other ISPs is beyond the scope of this tutorial
# Googling for "myisp outlook smtp" should help

# ------------------------------------------------------------------------------
# Configurations
# ------------------------------------------------------------------------------

# Construct envelope-from addresses of the form "[email protected]".
# auto_from on
# maildomain fermmy.server

# Use TLS.
# tls on
# tls_trust_file /etc/ssl/certs/ca-certificates.crt

# Syslog logging with facility LOG_MAIL instead of the default LOG_USER.
# Must be done within "account" sub-section above
# syslog LOG_MAIL

# Set a default account
account default : gmail
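With the account in place, the watchdog scripts can reuse msmtp for their alerts. Here's a minimal helper sketch; the `ALERT_TO` address and the `send_alert` function name are my own placeholders, not part of msmtp. Note that msmtp reads a complete message (headers, blank line, body) on stdin:

```shell
#!/bin/sh
# Sketch: mail an alert through the msmtp account configured above.
# ALERT_TO is a placeholder address - replace it with your own.
ALERT_TO="admin@example.com"

send_alert() {
    # Build a complete message (To/Subject headers, blank line, body)
    # and hand it to msmtp, which delivers via the default account.
    printf 'To: %s\nSubject: %s\n\n%s\n' "$ALERT_TO" "$1" "$2" \
        | msmtp "$ALERT_TO"
}

# Example call (commented out so nothing is sent by accident):
# send_alert "GPU over temp" "GPU 4 reached 55 C on Beaufort"
```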

.


----------



## DanHansenDK

*Status "Test System 4" aka "Beaufort" - 11.11.14 06:02am:*
System 24 hour test with "open hood" is done! No problems! This just keeps getting better and better

Now we'll close the chassis and watch it closely. It's always GPU 4, the graphics card in PCI-e port 4 [0000:0C:00.0] on this system, which is the one with the "issues". You may remember the troubles from back when we took GPU "babysteps" and added 1 card at a time. Happy days...
I'll continue with the installation of MSMTP in the post before this one in a little while, about an hour or so









Status 24 hour test - "open hood" - 24 hours:

Code:


# uptime
 05:55:56 up 1 day,  1:54,  1 user,  load average: 4.66, 4.80, 4.76

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +58.0°C  (high = +80.0°C, crit = +100.0°C)

*Status "Test System 4" aka "Beaufort" - 11.11.14 06:21am:*
System 24 hour burn-in test with "closed hood" has been launched!

Status 24 hour test - "hood down" - 0 minutes:

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +61.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +58.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "hood down" - 5 minutes:
Actually it seems to be better with the "hood down". I guess the air flow is better! I did notice that a lot of air just went upwards when the hood was up. It seems that more air is being "pressed" past GPU 4 (the graphics card next to the 2U industrial PSU, which ran too hot before we improved those 4 chassis fans). The CPU is getting a little hotter, but we haven't implemented the "custom" model/plan for the mobo's chassis & CPU fans yet. They all run according to the standard settings on the mobo.

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +57.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "hood down" - 10 minutes:

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +61.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +61.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +57.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "hood down" - 15 minutes:
Whenever you use a BIOS "standard" setup for the CPU and chassis fans, you get an unstable system going up and down, fans changing RPM all the time. Yes, I know there are different job units and these change the intensity of the work mode. But when a system like this is controlled by a "standard" setting, it will keep changing fan RPM because of the onboard temperature sensors: the fans spin according to the temperature setting, e.g. 40-50 degrees Celsius = 80%, 50-60 degrees Celsius = 90%, etc. This way the environment keeps changing. The temperatures keep increasing, and the fan RPM with them; this cools the environment off a little, and then the fan RPM decreases again. Up and down all day long... No, we'll make a plan/setting which makes the chassis fans run at a steady 80% all the time, keeping the temperature down, and set a security limit at which the fans increase to 100%. And then we'll set our "Watchdog* Shell-Scripts" to guard the system. That's the way forward
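A first cut of such a watchdog could parse `nvidia-smi` output and flag anything over a limit. A minimal sketch, assuming the limit and the alert hand-off are placeholders to be replaced:

```shell
#!/bin/sh
# Sketch of a "WatchDog" temperature check meant to run from cron.
# LIMIT is a placeholder threshold in degrees Celsius.
LIMIT=${LIMIT:-50}

# Print one current GPU temperature per output line.
gpu_temps() {
    nvidia-smi -a | grep 'GPU Current Temp' | grep -o '[0-9][0-9]*'
}

# Read temperatures on stdin; print a warning per value above LIMIT.
check_temps() {
    while read -r t; do
        if [ "$t" -gt "$LIMIT" ]; then
            echo "WARNING: GPU at ${t} C (limit ${LIMIT} C)"
        fi
    done
}

# In production: gpu_temps | check_temps | <mail or log the result>
```

The pipeline at the bottom is left commented because what to do with a warning (mail it via msmtp, log it, throttle BOINC) is a separate decision.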









Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +57.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "hood down" - 30 minutes:
As you can see, the temperatures have gone up again and the fans are right now at a lower RPM. The ongoing development of mobos is great, of course! But these "standard" settings controlled by onboard sensors, I just hate it... well, just a little anyway









Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +62.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +62.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +58.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "hood down" - 1 hour:
And the temperatures are down once again... Really looking forward to implementing the "Watchdog*"









Code:


# uptime
 07:33:58 up 1 day,  3:32,  1 user,  load average: 4.64, 4.65, 4.72

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +54.0°C  (high = +80.0°C, crit = +100.0°C)

Status 24 hour test - "hood down" - 24+ hours:
Test complete! No problems, but GPU 4's temperature is a little high, I think. I decided to keep the limit under 50 degrees Celsius, so it's a little high

I'll go on with the shell-scripts and the custom BIOS fan control setting/plan









Code:


# uptime
 01:06:58 up 2 days, 21:05,  1 user,  load average: 4.65, 4.64, 4.67

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +54.0°C  (high = +80.0°C, crit = +100.0°C)

.


----------



## DanHansenDK

*Status "Test System 4" aka "Beaufort" - 13.11.14 01:32am:*
Well, we'll have to find the right custom setting/plan for those 4 Papst chassis fans! They are the ones moving out all the heat that accumulates in the chassis/case. If this heat isn't removed, it builds up; the GPU fans can't do the cooling on their own! Not with the lack of air inside this "compact environment". Let's run a combined test of the CPU fan and the 4 chassis fans and see how things develop









Status Custom Setting/Plan for CPU/Chassis Fans - Setup1/Test 1:
Heat scale: Celsius
CPU/GPU's workload: 100%
CPU Cooler/fan RPM: 2U Industrial 10.000+
Chassis fans RPM: Papst Industrial 3.500+

Code:


CPU Config.:
50° --> 70%
60° --> 70%
70° --> 80%
75° --> 80%
80° --> max.

Chassis/GPU's
Chassis fan1: --> 100%
Chassis fan2: --> 100%
Chassis fan3: --> 100%
Chassis fan4: --> 100%

Test1 - Result after 15 minutes:

Code:


# uptime
 01:48:50 up 18 min,  1 user,  load average: 4.82, 4.60, 3.23

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +52.0°C  (high = +80.0°C, crit = +100.0°C)

Test1 - Result after 1+ hour:

Code:


# uptime
 03:07:04 up  1:36,  1 user,  load average: 5.06, 4.83, 4.74

CPU seems nice and cool. I don't think the data from a short period like this is enough to establish facts. Let's make a 12 hour test and look at the data. I'll make a little script that logs the result every second hour or so.
I like the idea of hardware running at 60-70%, not being maxed out... And some tests showed that there's pretty much no difference between this CPU fan running at 80% vs. 100%. That's in an "empty" environment, without 4 GPU's at 100%
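The little logging script could look something like this; the log path, the separator line and the schedule are my own choices, and the tool calls are guarded so the script degrades gracefully if a tool is missing:

```shell
#!/bin/sh
# Sketch: append a timestamped temperature snapshot to a log file.
# LOG is a placeholder path; override it to taste.
LOG=${LOG:-/tmp/cruncher-temps.log}

log_snapshot() {
    {
        date '+%Y-%m-%d %H:%M:%S'
        command -v nvidia-smi >/dev/null && nvidia-smi -a | grep 'GPU Current Temp'
        command -v sensors >/dev/null && sensors | grep '^Core'
        echo '----'   # separator between snapshots
    } >> "$LOG"
}

log_snapshot
```

Run it every second hour via cron, e.g. `0 */2 * * * LOG=/var/log/cruncher-temps.log /usr/local/bin/log_temps.sh` (paths hypothetical).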







I get that the GPU's and the CPU are better off running at a steady maximum, instead of increasing and decreasing the whole time! That's what I believe. But it calls for serious cooling, no doubt about it. I guess we'll land in an area around 70-75% for the CPU fan and 80-90% for the chassis fans. Still with something to give, should a fan "break" or crash, a faulty workunit/job occur, or maybe just a very hot summer day. These systems are semi-professional systems which aren't mounted in refrigerated compartments, yet

Just racks with large coolers in the top.
Sorry if my language is getting "cryptic", it's late, or should I say "early". See you in the "morning"









Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +49.0°C  (high = +80.0°C, crit = +100.0°C)

Test1 - Result after 2+ hour:

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 43 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +52.0°C  (high = +80.0°C, crit = +100.0°C)

Test1 - Result after 3+ hour:

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 43 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +53.0°C  (high = +80.0°C, crit = +100.0°C)

Test1 - Result after 4+ hour:

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +52.0°C  (high = +80.0°C, crit = +100.0°C)

Test1 - Result after 5+ hour:

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +52.0°C  (high = +80.0°C, crit = +100.0°C)

Test1 - Result after 12+ hour:

Code:


# uptime
 15:25:53 up 13:55,  1 user,  load average: 5.02, 4.77, 4.74

Code:


# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +53.0°C  (high = +80.0°C, crit = +100.0°C)

_test ended...._


----------



## DanHansenDK

(/&%¤#¤%&/()=

Sorry, it looks like we are not going to get any use out of our WatchdogFanControl.sh script!! I just had to try to install it before stopping for the day... This is what I got:

Code:


# apt-get install fancontrol

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  fancontrol
0 upgraded, 1 newly installed, 0 to remove and 6 not upgraded.
Need to get 20.2 kB of archives.
After this operation, 131 kB of additional disk space will be used.
Get:1 http://dk.archive.ubuntu.com/ubuntu/ trusty/universe fancontrol all 1:3.3.4-2ubuntu1 [20.2 kB]
Fetched 20.2 kB in 0s (129 kB/s)
Selecting previously unselected package fancontrol.
(Reading database ... 105723 files and directories currently installed.)
Preparing to unpack .../fancontrol_1%3a3.3.4-2ubuntu1_all.deb ...
Unpacking fancontrol (1:3.3.4-2ubuntu1) ...
Processing triggers for man-db (2.6.7.1-1ubuntu1) ...
Processing triggers for ureadahead (0.100.0-16) ...
ureadahead will be reprofiled on next reboot
Setting up fancontrol (1:3.3.4-2ubuntu1) ...
 * Not starting fancontrol; run pwmconfig first.
Processing triggers for ureadahead (0.100.0-16) ...

Code:


# pwmconfig

# pwmconfig revision 6166 (2013-05-01)
This program will search your sensors for pulse width modulation (pwm)
controls, and test each one to see if it controls a fan on
your motherboard. Note that many motherboards do not have pwm
circuitry installed, even if your sensor chip supports pwm.

We will attempt to briefly stop each fan using the pwm controls.
The program will attempt to restore each fan to full speed
after testing. However, it is ** very important ** that you
physically verify that the fans have been to full speed
after the program has completed.

/usr/sbin/pwmconfig: There are no pwm-capable sensor modules installed

/usr/sbin/pwmconfig: There are no pwm-capable sensor modules installed

This means no software-controlled fans... Not just yet anyway ;( Well, the most important things are the CPU heat and the GPU heat! And those work just fine
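For the record, pwmconfig looks for `pwm*` files under the hwmon sysfs tree, so a quick check shows whether the kernel exposes any PWM controls at all (often the motherboard's Super I/O driver has to be loaded first, which `sensors-detect` can suggest). A minimal sketch:

```shell
#!/bin/sh
# List any PWM fan controls the kernel currently exposes.
# If nothing is found, pwmconfig has nothing to work with.
found=0
for f in /sys/class/hwmon/hwmon*/pwm[0-9]*; do
    if [ -e "$f" ]; then
        echo "$f"
        found=1
    fi
done
if [ "$found" -eq 0 ]; then
    echo "no pwm-capable hwmon files found"
fi
```

If this prints paths like `/sys/class/hwmon/hwmon2/pwm1` after loading the right driver, pwmconfig should stop complaining.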









.


----------



## DanHansenDK

*Status "Test System 4" aka "Beaufort" - 13.11.14 03:34pm:*
Test1 is done! Let's change the setup and try running the chassis fans at 80%. Why not 70% or even 60%? There's nothing to be gained by replacing the original chassis fans with industrial fans if you run them at 60%; then we might as well keep the original fans and run them at 100%. Let's try and see what happens when we run them at 80%. I'll make a suggestion for a configuration. If you have an idea for a setup, please let me know. I know you are quite good at all this in here









Status Custom Setting/Plan for CPU/Chassis Fans - Setup2/Test 1:
Same GPU config. Custom chassis fan/GPU setup. Let's keep it a combined test for the 4 chassis fans; it's a little early to change them individually, I think.
The GPU temperatures are "circling" in the area of 45-50 degrees Celsius. Therefore I think it would be a bad idea to set a limit for RPM change close to these values; if we did, they would just increase and decrease all the time, just like the CPU fan does with the standard setup! And it's a little annoying that as soon as the Nvidia driver is installed and the Nouveau driver removed, all data from the LM-sensors scripts is missing! All we get is the 4 CPU cores. OK, we can work with that, but it would have been nice to be able to see the fan RPM and the onboard thermal sensors' output! I have to enter the BIOS all the time to check the onboard sensors' output/temperatures. If anyone has found a way around this issue, I'd like to hear about it. Please remember we are in a CLI environment with the Nvidia/CUDA Debian package installed. We'll try this setup:
_I'll go on with the installation of MSMTP & the Watchdog* Shell-Scripts down below (next post)_

Heat scale: Celsius
CPU/GPU's workload: 100%
CPU Cooler/fan RPM: 2U Industrial 10.000+
Chassis fans RPM: Papst Industrial 3.500+

Code:


CPU Config.:
50° --> 70%
60° --> 70%
70° --> 80%
75° --> 80%
80° --> max.

Chassis/GPU's 1-4 config.:
30° --> 70%
40° --> 80%
50° --> 80%
55° --> 80%
60° --> max.

Setup2/Test1 - Result after 5 minutes:
_test stopped...._

I'll try something new. There's too much variation in the onboard thermal sensors. We'll have to make a secure system here! I'll spend some more time figuring this out, even though I'd rather finish those shell-scripts

OK...

A CPU working on BOINC has got 4 levels in this hypothesis:
Level1: temp. idle
Level2: temp. 5 sec. after boinc-client has been stopped
Level3: temp. min. at 100% work load
Level4: temp. max. at 100% work load

Chassis fans/mobo onboard thermal sensors 1-7 (sensor 8 has gone loopy):
Level1: temp. idle
Level2: temp. 5 sec. after the GPUs have been stopped
Level3: can't access it
Level4: can't access it

CPU fan/Levels results:
Level1: 34-37° --> 35°
Level2: 45-47° --> 45°
Level3: 53-56° --> 55°
Level4: 59-62° --> 60°

_Internal notes:_
----
34-37 35° : 0
----
45-47 45° : 1
----
00-00 55° : 2
00-00 60° : 3
----

Based on these numbers, let's see if we can set some limits in a smart way.

CPU fan/Levels config.:
0° - 39° --> 70% (so the fan eases off a little when the boinc-client is idle)
40° - 69° --> 80% (operating speed; this is within the limits of a CPU at 100% load, +/- 10°)
70° - 79° --> 90% (in case of a special work unit/job or a hot summer day, it increases a little)
80°+ --> max. (CPU limit reached; maximum speed in case of CPU overheating)

This gives a CPU config. that looks like this; I've already set it up in the BIOS:

Code:





CPU Config.:
30° --> 70%
40° --> 80%
50° --> 80%
70° --> 90%
80° --> max.
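If a userspace script should ever apply this plan instead of the BIOS, the threshold logic maps onto a small shell function. This is just a sketch of the mapping itself: `fan_duty` is a hypothetical helper name, and actually writing the duty cycle to a fan header is board-specific (pwmconfig/fancontrol territory):

```shell
#!/bin/sh
# fan_duty TEMP_C -> duty-cycle percentage, mirroring the BIOS plan above
fan_duty() {
    t=$1
    if   [ "$t" -ge 80 ]; then echo 100  # CPU limit reached: full speed
    elif [ "$t" -ge 70 ]; then echo 90   # hot work unit or a hot summer day
    elif [ "$t" -ge 40 ]; then echo 80   # normal band for 100% load
    else                        echo 70  # idle: let the fan ease off
    fi
}
```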

Chassis fans/GPU levels results:
Level1: 30-33° --> 30°
Level2: 33-39° --> 35°

_Internal notes:_
----
30-33 30° : 0
33-39 35° : 1
----
00-00 00° : 2 <---- missing data
00-00 00° : 3 <---- missing data
----

Based on these numbers, let's see if we can make some reasonable limits. We haven't got reliable temperatures from the mobo's onboard thermal sensors while the system is working at 100%. We've got plenty of data from the GPUs (nvidia-smi), but that's not what we need to get this right! So we are missing some data, but let's try it anyway.
Another thing is the way the BIOS has been designed. I know it's a pretty d... hard market! But when you let a user choose the source from which the fans are controlled, in this case the M/B, then please change the temperature limits and the descriptive texts accordingly!!! When I chose M/B as the source, the text still read "CPU temp.", the limits couldn't be set lower than 30 degrees Celsius, and the lowest limit at which the M/B activates 100% fan was 60 degrees Celsius. I'd rather not have 60 degrees anywhere on my mobo at any time, that's for sure!! When changing the source from CPU to M/B, the heat scale should of course be changed accordingly.








This means we haven't got a setting lower than 30 degrees. I would have made an "idle" setting so that the chassis fans could reach a state of rest when the system is idle, but thanks to the brilliant scale on this board, that's not going to happen. Instead we're going to keep it really simple; there's not much else to do anyway... And the lowest setting for 100% fan has to be 60 degrees, because that's the bottom of the scale.









Chassis fans/GPU config.:
This is how I would have liked it to be. Even though we're missing data, this would have been much better. OK, we'll manage! In the end, all that matters is that the fans don't have to run at max speed all the time.








0° - 24° --> 70% (so the fans ease off a little when the boinc-client is idle)
25° - 59° --> 80% (operating speed; this should be within the limits of the GPUs at 100% load, per the onboard thermal sensors)
60°+ --> max. (the board's lowest possible setting for 100% fan)

Code:





Chassis/GPU 1-4 config.:
30° --> 70%
40° --> 80%
50° --> 80%
55° --> 80%
60° --> max.

*Status "Test System 4" aka "Beaufort" - 14.11.14 04:46am:*
OK, the configuration is done. Let's launch another test and see how it goes. Meanwhile, we'll go on with MSMTP and the shell scripts.








*Status "Test System 5" aka "Halifax" - 14.11.14 04:46am:*
It just runs like a charm... It really works well, I must say. Knock on wood!









Code:





# uptime
 04:55:31 up 11 days, 12:29,  1 user,  load average: 5.60, 5.07, 4.90

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +49.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +48.0°C  (high = +80.0°C, crit = +100.0°C)



----------



## Tex1954

Wow! You are really trucking along!


----------



## DanHansenDK

Hi Tex









Yes... I think it's time to make some kind of to-do list which can be used right away! So I'm trying to fix these issues regarding the shell scripts, mobo BIOS fan settings, etc. Thanks for looking in on me.









BTW, I've had a great day at the tracks.







I know it's because a lot of jobs have accumulated, but anyway, it's great to experience, right?








I'm looking forward to building the 3rd system using 4 GPUs. My plan is to have 5 running, but I have to take baby steps. These are costly toys...

Check this out.
http://boincstats.com/en/stats/-1/user/detail/2560253/lastDays

I was guessing at about 15.000 points per system per night. It may be a little better than that, though sadly not as much as for this night/day.









Most of all, I hope somebody wants to build a system like it. I'm using hardware that can be found all over the world. Even the rack is a standard, both in design and in size.









2 systems running 4x GPUs and 2 systems running 1x GPU:
I'm especially fond of the SETI job done by one of the GPUs in 6 min 29 sec. Happy days, man!







OK, those jobs aren't that large, but anyway, it's fun to watch. Back in the '90s it sometimes took more than a day to do a job, if I remember correctly.










Lots of nice SETI jobs in line, waiting for the GPUs.










OK, back to work... Continuing in the post above.


----------



## Tex1954

Very Nice!

I may have to borrow your knowledge to get a server running on a system in the future, we will see.

But certainly all you are doing can also be mentioned in the Linux forum as well! ALL good stuff!


----------



## Finrond

you should join Team Overclock.net! (This can be done from the project websites)


----------



## DanHansenDK

Hi Tex,
Quote:


> I may have to borrow your knowledge to get a server running on a system in the future, we will see.


Certainly... Just say the word my friend









Hi Finrond,
Quote:


> you should join Team Overclock.net! (This can be done from the project websites)


Thanks.. I'll keep it in mind. I never joined anything, mostly because I didn't understand the reason for joining such a "club".







For now I'm just little me, but let's see when I've got the first 5 HLCLIMGPUBS'ers. I certainly know where I feel "at home".









Sorry for the delay... I had planned an hour of relaxation. That was before I stumbled right into a programme on TV: a tribute to Leonard Cohen, with some of the better musicians in the world (I think): Nick Cave, Rufus & Martha Wainwright, Julie Christensen, Perla Batalla, etc. I'm telling you... This was a rare experience.. A trip to the edge of reason.. Oh my God!!!! Here's one if you like more than just pop:







https://www.youtube.com/watch?v=b4bYDxbVIKE

OK... Back to work...


----------



## Tex1954

Wow, I know the feeling... I get lost in YouTube performances sometimes at home... like going to a concert...

Enjoy!


----------



## DanHansenDK

*Status "Test System 5" aka "Halifax" - 15.11.14 00:48am:*
Control Unit --> "Test System 4" aka "Beaufort" running the config. below.

Note's:
OK!!! I'm a complete imbecile! I found the reason for the increasing GPU heat! Oh my G.., it's so easy! I don't know what I've been thinking!
This system is running the BIOS standard config., and when the CPU temperature decreases below e.g. 40 degrees, the chassis fans decrease in RPM accordingly! Why? Because in the standard BIOS config. the fan control follows the CPU temperature, not the M/B sensors.








Might as well be asleep!!







Not doing much better anyway









_# Internal Data Note:_

Status - Alert! Watch Development:
Activate BIOS control setup if it gets worse! Use config. --> Beaufort "Config2"

CPU fan/Levels results:
Level1: 34-37° --> 35°
Level2: 45-47° --> 45°
Level3: 53-56° --> 55°
Level4: 59-62° --> 60°

_Internal notes:_
----
34-37 35° : 0
----
45-47 45° : 1
----
00-00 55° : 2
00-00 60° : 3
----

Based on these numbers, let's see if we can set some limits in a smart way.

CPU fan/Levels config.:
0° - 39° --> 70% (so the fan eases off a little when the boinc-client is idle)
40° - 69° --> 80% (operating speed; this is within the limits of a CPU at 100% load, +/- 10°)
70° - 79° --> 90% (in case of a special work unit/job or a hot summer day, it increases a little)
80°+ --> max. (CPU limit reached; maximum speed in case of CPU overheating)

This gives a CPU config. that looks like this; I've already set it up in the BIOS:

Code:





CPU Config.:
30° --> 70%
40° --> 80%
50° --> 80%
70° --> 90%
80° --> max.

Code:





# uptime
 00:46:48 up 12 days,  8:20,  1 user,  load average: 4.36, 4.47, 4.61

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +48.0°C  (high = +80.0°C, crit = +100.0°C)

Code:





# uptime
 02:01:12 up 12 days,  9:35,  1 user,  load average: 4.91, 4.85, 4.79

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +49.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +47.0°C  (high = +80.0°C, crit = +100.0°C)

Code:





# uptime
 02:10:52 up 12 days,  9:44,  1 user,  load average: 4.67, 4.75, 4.75

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +49.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +46.0°C  (high = +80.0°C, crit = +100.0°C)

Code:





# uptime
 02:30:14 up 12 days, 10:04,  1 user,  load average: 4.44, 4.63, 4.72

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +47.0°C  (high = +80.0°C, crit = +100.0°C)

Code:





# uptime
 02:35:48 up 12 days, 10:09,  1 user,  load average: 4.55, 4.60, 4.68

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +49.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +46.0°C  (high = +80.0°C, crit = +100.0°C)

Status - Alert! Watch Development:
Activate BIOS control setup if it gets worse! Use config. --> Beaufort "Config2"
Looks like we'll have to do exactly that... I'll let it run a little while longer just to see what happens. It's an easy fix anyway.









Code:





# uptime
 10:18:46 up 12 days, 17:52,  1 user,  load average: 4.94, 4.85, 4.73

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 51 C  <---------- +2 DEGREES OVER LIMIT
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +50.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +48.0°C  (high = +80.0°C, crit = +100.0°C)

Code:





# uptime
 15:17:05 up 12 days, 22:51,  1 user,  load average: 4.41, 4.58, 4.65

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 51 C  <---------- +2 DEGREES OVER LIMIT
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +51.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +49.0°C  (high = +80.0°C, crit = +100.0°C)

*Status "Test System 5" aka "Halifax" - 16.11.14 10:39am:*
Control Unit --> "Test System 4" aka "Beaufort" running the config. below.

Status - Alert! Watch Development:
"Halifax" Using standard BIOS config.!

Code:





# uptime
 10:41:10 up 13 days, 18:15,  1 user,  load average: 4.76, 4.68, 4.67

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +53.0°C  (high = +80.0°C, crit = +100.0°C)

Status - Alert! Watch Development:
"Beaufort" Using BIOS config.2 from Test2!

Code:





# uptime
 10:36:24 up 2 days,  6:12,  1 user,  load average: 4.91, 4.80, 4.78

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A


Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +54.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +53.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +52.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +51.0°C  (high = +80.0°C, crit = +100.0°C)

_# END_
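All the snapshots in this post were taken by hand. A small cron job can append the same readings to a log instead; a sketch, with the log path and the cron schedule as placeholders (errors from nvidia-smi/sensors are discarded, so it also runs on a box missing one of the tools):

```shell
#!/bin/sh
# snapshot: append one timestamped uptime/GPU/CPU reading to $TEMPLOG.
# Run from cron, e.g.:  */10 * * * * . /root/templog.sh && snapshot
TEMPLOG=${TEMPLOG:-/var/log/templog}

snapshot() {
    {
        date '+%F %T'
        uptime
        nvidia-smi -a 2>/dev/null | grep 'GPU Current Temp'
        sensors 2>/dev/null | grep '^Core'
        echo '----'
    } >> "$TEMPLOG"
}
```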



----------



## DanHansenDK

Hi Tex







Quote:


> Wow, I know the feeling... I get lost on you tube performances sometimes at home... like going to a concert...


Thanks my friend... Just a wonderful experience...


----------



## DanHansenDK

Calling for JetPak12









Hi JetPak12,
Quote:


> I believe you have what is called the GT640 "Rev 2". On top of having GDDR5 in place of GDDR3, it also has a cut down core with fewer ROPs and texture mappers, but the shader count remains the same as the old GT640. It also got a decent stock clock boost, which should work out to better compute performance at less watts. thumb.gif
> I have the Gigabyte version of the same card running in the BGB right now and it has been staying right at 59C for the past two days with 99% GPU usage and fan settings on auto. It also has an 8800GT sandwiched right on top of it running at full bore too, and I think temps on that guy are in the 70s (and the 8800GT fan is so much louder! rolleyes.gif)


Regarding this:
_"....I have the Gigabyte version of the same card running in the BGB right now and it has been staying right at 59C for the past two days with 99% GPU usage and fan settings on auto..."_

How does it run? No problems yet? I'm asking because I've had a card crash and get replaced because of a hardware failure!!! I know it should be able to endure 95 degrees of heat (Celsius, right? I hope it is). In my case the vendor is Asus, and if I'm remembering correctly, it should be able to endure 95 degrees Celsius. I hope it's not Fahrenheit and that I've been a complete fool; I don't use that scale, so I'm not aware of the difference.


----------



## Tex1954

Not to worry! Just about any modern GPU can handle 80-C 24/7 and the failed one you had is just production fallout.

Looks perfect temperature wise to me! My GTX 560 Ti cards run 58-60C 24/7 water cooled without a glitch for 3 years now... still kicking! My water cooled GTX 460's OC to 900MHz and ran about the same, but those are retired now... maybe put the fans back on them and give away as a prize too...eventually....someday....

LOL!


----------



## jetpak12

Quote:


> Originally Posted by *DanHansenDK*
> 
> Calling for JetPak12
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Hi JetPak12,
> 
> Regarding this:
> _"....I have the Gigabyte version of the same card running in the BGB right now and it has been staying right at 59C for the past two days with 99% GPU usage and fan settings on auto..."_
> 
> How does it run? No problems yet? Because I've had a card crashed and replaced because of a hardware failure!!! I know that it's able to endure 95 degrees heat (Celsius, right? I hope it is) In my case the vendor is Asus and if I'm remembering correctly, it should be able to endure 95 degrees Celsius. Hope it's not Fahrenheit and that I've been a complete fool. Don't know the scale so I'm not aware of the diff.


I don't run my card under continuous load like you do, but I haven't had any problems with mine. My main GPU is now a 290X, so I'll probably have it running on BOINC in the secondary slot more often now.

But yes, my temps were in Celsius, and in some cases it's run up to 90C without any problems. The GPU's fan has been sufficient on its own to keep it below 90C.
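(For reference, since the two scales came up: °F = °C × 9/5 + 32, so the 90C above is about 194F, while a "95F" limit would be an absurdly low 35C. GPU thermal limits are specified in Celsius. A quick integer check in the shell:)

```shell
#!/bin/sh
# c2f TEMP_C -> whole-degree Fahrenheit equivalent (integer arithmetic)
c2f() { echo $(( $1 * 9 / 5 + 32 )); }

# c2f 90 prints 194; c2f 95 prints 203
```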


----------



## DanHansenDK

Hi JetPak12








Quote:


> "....But yes, my temps were in Celsius, and in some cases its run up to 90C without any problems. The GPU's fan has been sufficient on its own to keep it below 90C..."


Thanks, my friend. Then I'll relax just a little.







The reason I'm all over this is of course the card that failed. But besides that, I once noticed the card's heatsink being very, very hot; more than the 55-58 degrees "nvidia-smi" reported, at least that's what I think! I didn't measure it, and I've learned not to trust those thermal sensors too much. But when you tell me you've run a card up to 90 degrees Celsius, or close to it, I'm not that worried any more.









Hi Tex








Quote:


> "....Not to worry! Just about any modern GPU can handle 80-C 24/7 and the failed one you had is just production fallout.
> Looks perfect temperature wise to me! My GTX 560 Ti cards run 58-60C 24/7 water cooled without a glitch for 3 years now... still kicking! My water cooled GTX 460's OC to 900MHz and ran about the same, but those are retired now..."


OK, thanks... I've got a GTX770 using the standard fans, and this card isn't getting past 60 degrees, I think (as the image below illustrates).







Actually, I've been thinking about installing 2 more, just to see how much a "normal" desktop can do. The mobo I'm using in this box is an Asus Maximus VI Extreme with 5 PCIe slots: room for 4 standard GPUs/cards, or 3 "double" (dual-slot) cards, or whatever they're called.
I'm testing and learning about these other cards for "Headless Linux CLI Multiple GPU Boinc Server" Mark II. As we talked about a little while back, it has to be low-profile cards, so I'm thinking about using water cooling for that version. But much can change between now and then anyway.







The mobo I'm using right now is, according to ASRock, coated and more or less waterproof. So I'm keeping in mind all the time that I may have to use water cooling.
Quote:


> "....maybe put the fans back on them and give away as a prize too...eventually....someday..."


Please explain. Sorry











Thanks for helping me...
Have a nice weekend


----------



## Tex1954

Quote:


> Quote:
> "....maybe put the fans back on them and give away as a prize too...eventually....someday..."
> 
> Please explain. Sorry redface.gif


Well, when one water cools the GPU's, the stock fans/heatsink come off and the GPU water block goes on... I meant that I would need to remove the water block and re-install the original coolers.

Then I could give them away as a BGB prize or sell them for $15 on ebay or something...


----------



## DanHansenDK

Hi Tex,

OK... I see.. Sorry for asking








I've been studying those water units/blocks, but I haven't been able to find a site with blocks for e.g. the Asus GT640 Low Profile!! I found a lot of different models listed, thousands, but none that fitted my cards.. As I mumbled about earlier on, I'm thinking about using water cooling for the next model.
Something nice has happened in the Kingdom of Denmark... A shipping business which does nothing but handle delivery of goods from the States to Denmark. So now it's possible to order anything from over there. Well, if the advert is to be believed, anyway..









*Status "Test System 5" aka "Halifax" - 17.11.14 02:57pm:*
Control unit for "Test System 4" aka "Beaufort".
"Beaufort" is using BIOS config.2 from Test2! Is still looking good. CPU at 60 degrees but is till 10 degrees from the limit (limit for 90% fan speed). Seems to be a great config.!!

Code:





# uptime
 14:51:03 up 3 days, 10:27,  1 user,  load average: 5.11, 4.84, 4.77

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 47 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +57.0°C  (high = +80.0°C, crit = +100.0°C)

Status - Alert! Watch Development:
"Halifax" Using standard BIOS config.!
It has gone down a little, but is still high; higher than the exact same machine "Beaufort", anyway. Still, it's not that hot compared to the critical limits for the GPUs and the CPU. Let's go on with the MSMTP config and the shell scripts. I hope they'll run without any further issues for a few days, so that we can get those scripts implemented.








OK, it's running a little hot because of the BIOS config./fan setup. But it's been running for 14 days without any errors now, and it really seems to be a good and stable system. We're getting close!! Let's get those scripts done and get some additional security; security in the sense of knowing that we'll be alerted if something happens on the system.









Code:





# uptime
 14:51:43 up 14 days, 22:25,  1 user,  load average: 4.87, 4.77, 4.72

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +53.0°C  (high = +80.0°C, crit = +100.0°C)

Status - Alert! Watch Development:
"Halifax" Using standard BIOS config.!
Running very nicely! This is without any doubt the longest we've had a 4-GPU system running at 100% 24/7 without any failures/errors!! The reason I flagged the "high" temperature was only the limit we set earlier on. The system looks to be running perfectly, and when it's switched to the same BIOS fan control plan as Test System 4 aka "Beaufort", it'll only get better.







I think we'll end the test here and go on with the shell scripts.









_BTW, I'm introducing Test System 6 aka "Whitley", which will use the same mobo and of course the same industrial Papst fans, 2U PSU and 2U CPU cooler, but I'm thinking of testing an i7, just to see if I'm right in my earlier assumption that an i7 wouldn't make that much of a difference in this "data chewing" case. What do you think??_

Code:





# uptime
 08:58:01 up 18 days, 16:31,  1 user,  load average: 4.82, 4.68, 4.63

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +53.0°C  (high = +80.0°C, crit = +100.0°C)



----------



## Tex1954

Generally speaking, there is no need to water cool lower end cards because the small fans work fine on low power stuff. Most water blocks address the high power higher end GPU's...

In any case, I re-read this entire thread and you are kick'n it!

I love this stuff!










PS: A bit chilly outside today....


----------



## gamer11200

As a Canadian, looks like a normal day there Tex.


----------



## Tex1954

Quote:


> Originally Posted by *gamer11200*
> 
> As a Canadian, looks like a normal day there Tex.


LOL!

I suppose it would look normal to you... you certainly have more of that white stuff than I do!


----------



## tictoc

Not Canadian, but that looks like the next 4 months for me too.









Speaking of a bit chilly:



That's -27ºC for our Canadian friends.


----------



## Tex1954

Geez that is cold...!!!


----------



## Finrond

Ugh, My +15-20 is starting to look balmy by comparison.


----------



## gamer11200

Quote:


> Originally Posted by *Finrond*
> 
> Ugh, My +15-20 is starting to look balmy by comparison.


Bottle it up and send it to us


----------



## DanHansenDK

Hi Guys!!!

Thanks for the pictures, I love it







Nice to see where in the world you are and how it looks there







Thanks!!

BTW please ship some of the white stuff over here... We've got none in a while







15 years ago it was white every winter. The snow got so deep we could almost touch the phone lines, or whatever they were back then.







It's getting to be winter here as well, but nothing like that. Not yet anyway.. Ohh, I'm dreaming. Love skiing.. Got a new pair a couple of years ago and haven't had the chance to try them out yet...

TicToc, I would very much like to see that cabin from a little farther off, it looks nice man









Hi gamer... You Canadians are the luckiest people in the world! That kind of nature and clean environment. I do not envy you, not at all









A few questions!! Please let me know what you guys think









From the last post:
1. BTW I'm introducing Test System 6 aka "Whitley" which will be the same mobo and of course the same industrial Papst Fans, 2U PSU and 2U CPU cooler, but I'm thinking of testing an i7, just to see if I'm right in my assumption from earlier on. That the i7 wouldn't make that much of a difference in this "data chewing" case. What do you think?? Should we try such a CPU?? Or do you agree in my assumption?

2. Regarding the shell scripts: based on the test results from Test Systems 4 & 5, what limits would you set? We need 3 states for the system to be in. That means two limits: one limit where you'll be warned about the system getting hot, and a limit for "system critical heat". Not the same limit as e.g. the Intel CPU has (80 degrees Celsius in most cases), but a limit where we want the system to shut down and wait for maintenance.








How these mail warnings, alerts and logging work, you can read about a bit back in the thread. Shortly described: the system logs the temperatures for the GPUs and all CPU cores, no matter how many you've got, whenever the limit is crossed, and a warning mail is sent to you as well. If the same unit gets even hotter, the script will alert you by sending a mail, log it, and then shut down the whole system right away!!! Therefore we need 2 limits..
When logging these temperatures, all temperatures will be logged of course. Or at least that's what I'm working on. That way we can see if e.g. some of the other GPUs were running hot as well, or if it was only the one GPU which ran a little hot. This is often a strong indicator of the problem... Time and date of incidents will be logged as well, of course.







Actually, this tool can be used in a test version as well. Like a sort of debugger









Limit for warning mail and logging. Sample message:
[Warning! GPU 0000:04:00 is running a little hot and has crossed X degrees. Other GPUs were under limit. CPU core temperatures were under limit. Incident is logged. System will keep on running]

Limit for critical alert mail, logging and system shut-down. Sample message:
[Critical Alert! CPU Core 2 has reached a critical temperature of X degrees. Other CPU cores were under limit. GPUs were under critical limit. Incident is logged. System will now shut down]
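To make the two-limit idea concrete, here is a minimal sketch of such a watchdog in plain `sh`. The WARN/CRIT values, the log path and the mail address are placeholders of my own, not tested limits, and the mail/shutdown lines are left commented out so the sketch is safe to run:

```shell
#!/bin/sh
# Sketch of the two-limit temperature watchdog described above.
# WARN/CRIT limits, log file and mail address are placeholders.
WARN=70            # warning mail + log, system keeps running
CRIT=78            # alert mail + log, then shut down (kept below Intel's 80 C)
LOG=/var/log/temp-watchdog.log
MAILTO=admin@example.com

# check_temp NAME TEMP -- decide and report the action for one sensor reading
check_temp() {
    name=$1; temp=$2
    if [ "$temp" -ge "$CRIT" ]; then
        echo "CRITICAL $name ${temp}C"
        # echo "Critical Alert! $name reached ${temp}C" | mail -s "Critical Alert" "$MAILTO"
        # shutdown -h now
    elif [ "$temp" -ge "$WARN" ]; then
        echo "WARNING $name ${temp}C"
        # echo "Warning! $name crossed ${temp}C" | mail -s "Warning" "$MAILTO"
    else
        echo "OK $name ${temp}C"
    fi
}

# On the real box, feed it every GPU temperature reported by nvidia-smi, e.g.:
#   nvidia-smi -a | awk '/GPU Current Temp/ {print "GPU" n++, $(NF-1)}' |
#       while read name temp; do check_temp "$name" "$temp" >> "$LOG"; done
check_temp GPU0 50    # prints "OK GPU0 50C"
```

The CPU side would be fed the same way from `sensors` output; the point is only that one function handles both thresholds, so warning and shutdown can never get out of sync.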

So suggestions please.. If you don't mind








As mentioned above, please keep in mind that it's probably best to set the critical limit below 80 degrees. Or that's my suggestion anyway. I don't like systems to be that hot. I guess you figured that out light-years ago.








_For now, temperatures are only in Celsius_

Limit for warning mail and logging:
Limit for critical alert mail, logging and system shut-down:

3. Do you know where to find an ASUS version of this card, and where to find a good place to compare cards? I once had an address for a great site where important stuff like "Endures temperatures up to:" was to be found!! But I can't find it.

This card I need in an Asus version (I think) for Test System 7*, still in Low Profile! (heat-enduring qualities required)
GeForce GTX 750 Ti 2GB 128-Bit GDDR5 PCI Express 3.0 x16 HDCP Ready Video Card
Sample Vendor: http://skinflint.co.uk/kfa-galaxy-geforce-gtx-750-ti-oc-low-profile-751gh8hx9kxz-75igh8hx9kxz-75igh8hx9kxx-a1110002.html
Specifications: http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-750-ti/specifications


And this one, if that combination exists. Else from the best vendor you know. (heat-enduring qualities required)
Radeon R7 250 1GB 128-Bit GDDR5 PCI Express 3.0 SFF Video Card
Sample Vendor: http://www.sapphiretech.com/presentation/product/?cid=1&gid=3&sgid=1226&pid=2149&lid=1
Specifications: _haven't found any_


I found this site where you can see a sample of the difference between the 2 cards in action, in this case Bitcoin mining:
http://www.tomshardware.com/reviews/geforce-gtx-750-ti-review,3750-17.html

.


----------



## scubadiver59

Quote:


> Originally Posted by *DanHansenDK*
> 
> Links to the steps, which is not going to be changed. The way we install the fans, makes room for air (breathing holes) etc. etc.:
> http://www.overclock.net/t/1467918/ubuntu-server-12-04-4-64bit-boinc-using-gpu-from-geforce-gt610-to-crunch-data/90#post_22734366
> http://www.overclock.net/t/1467918/ubuntu-server-12-04-4-64bit-boinc-using-gpu-from-geforce-gt610-to-crunch-data/90#post_22741815


Coming in late to the game, and really impressed with what you're doing with your systems!!!

One thing I'm surprised about is that it took you so long to take out those blank card slots and to remove the connection cover surrounding the motherboard connectors on the back of the case.

Before, you were eventually warming all the air up with nowhere for it to exit except through those small holes in the blank slot covers and that small rectangular perforated area above the motherboard connectors: the air just swirled around in the case before it could finally work its way out those few exit points.

I'm sure that by leaving out those covers you dropped a few degrees, since the hotter air now had somewhere to go...fast.


----------



## Finrond

Quote:


> Originally Posted by *DanHansenDK*
> 
> 3. If you know where to find a ASUS version of this card and where to fin a good place to compare cards. I once had an address for a great site where important stuff like "Endures temperatures up to: " were to be found!! But I can't find it.
> 
> This card I need in a Asus version (I think) for Test System 7* still in Low Profile! (heat enduring qualities required)
> GeForce GTX 750 Ti 2GB 128-Bit GDDR5 PCI Express 3.0 x16 HDCP Ready Video Card
> Sample Vendor: http://skinflint.co.uk/kfa-galaxy-geforce-gtx-750-ti-oc-low-profile-751gh8hx9kxz-75igh8hx9kxz-75igh8hx9kxx-a1110002.html
> Specifications: http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-750-ti/specifications
> 
> .


The maximum GPU temperature for Nvidia cards can be found at the bottom of the specifications page you linked there. In this case, it's 95 C.


----------



## tictoc

Quote:


> Originally Posted by *DanHansenDK*
> 
> From the last post:
> 1. BTW I'm introducing Test System 6 aka "Whitley" which will be the same mobo and of course the same industrial Papst Fans, 2U PSU and 2U CPU cooler, but I'm thinking of testing an i7, just to see if I'm right in my assumption from earlier on. That the i7 wouldn't make that much of a difference in this "data chewing" case. What do you think?? Should we try such a CPU?? Or do you agree in my assumption?
> 
> 3. If you know where to find a ASUS version of this card and where to fin a good place to compare cards. I once had an address for a great site where important stuff like "Endures temperatures up to: " were to be found!! But I can't find it.
> 
> This card I need in a Asus version (I think) for Test System 7* still in Low Profile! (heat enduring qualities required)
> GeForce GTX 750 Ti 2GB 128-Bit GDDR5 PCI Express 3.0 x16 HDCP Ready Video Card
> Sample Vendor: http://skinflint.co.uk/kfa-galaxy-geforce-gtx-750-ti-oc-low-profile-751gh8hx9kxz-75igh8hx9kxz-75igh8hx9kxx-a1110002.html
> Specifications: http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-750-ti/specifications


*1.* Hyperthreading can increase your output, but the percentage increase will vary from project to project. In a system running strictly CPU work, output can increase by as much as 30%. The increase in output will be very different from project to project due to the way that HT works. With your setup and the need to reserve CPU cores to feed the GPUs, there will more than likely not be much difference between HT or no HT. Even if you do see an increase in output with HT, you will have to weigh that against the cost of HT (increased purchase price, added energy consumption, and with the increased energy consumption comes increased heat). Ultimately the only way to know whether HT is worth it in your setup is to test it out and record the results.

*3.* ASUS does not make a low profile 750 Ti. The best looking low profile 750 Ti that I have seen is the MSI Low Profile 750 Ti. With the dual fan, I would imagine that this card should run cooler and quieter.


----------



## DanHansenDK

Hi TicToc









So you agree with me more or less regarding the i7, right?! Yes, it's only a test which will show the difference. I'll make a system completely alike and install an i7..

The MSI card, with dual fan on a low profile!? Sounds just great. Actually I think it's the large heatsink on the Asus GT640 in combination with the Papst industrial fans which makes it work. Let's see how the heatsink on the MSI card looks. Great, thanks









Quote:


> One thing I'm surprised about is that it took you so long to take out those blank card slots and to remove the connection cover surrounding the motherboard connectors on the back of the case.


Hi Scubadiver,
Thanks I think, and welcome







I'm not quite sure what the "connection cover" is, but regarding the rear end brackets I've got an explanation








This version of the "Test Systems" was one of the first ones, containing only 1 card, like post 10. By reading it all you will find that it wasn't until Test System 3, if I'm remembering correctly, that the heat became an issue.








After that, we got to "Test System 3", now attempting to use 4 GPUs. Before this was possible we first had to solve the problem of crunching headless on multiple GPUs on Linux.
Around post 46 we learned that card 4 got hot, or the air around it did. Still there were no problems with the other cards. Installing two SPOT-air fans solved the problem, and from that point forward I knew what to do.
This was all back in April, and the summer in Denmark was coming. Therefore I decided to test it during the warm period and searched the market for better chassis fans. But, as you see in the picture below, there's no backside bracket mounted for GPU 4. GPU 4 was the GPU with the heat issues.



I can't remember when, but it was during the summer that I decided to leave the rear brackets off. Actually, I was thinking about mounting the rear brackets again earlier on. Why? Because the rack in question is a 60 cm rack and the chassis is 55 cm. Not much room for the air to be "sucked" up and out through. This rack only has a couple of 220-volt top fans. Large, but only 2 of them. I hoped to make the air flow out through more than just the backside. As you can see in the picture above, there's ventilation in both sides of the chassis as well. More in the right side than the left, but anyway, by splitting the hot air up into 3 areas or more I was hoping for a good result. I'm not so sure about this any more, but I'll test it, that's for sure. But this is all later on.


Quote:


> I'm sure that by leaving out those covers that you dropped a few degrees since the hotter air now had somewhere to go...fast.


Anyway, you are quite right about those rear brackets. I'm sure they would have made a difference back then, when the chassis fans were the original ones. I'm just not sure it would have been enough! The action which made the biggest impact on the heat issue was the replacement of the chassis fans. Or that's what the tests showed, anyway.









Sorry if it's all a little confusing.. I'm no benchmark professor that's for sure









I've modified the chassis pretty much. As you see, the PSU is an industrial 2U PSU with 2 fans, 1 in the front and 1 in the rear. Chassis fan number 1 is mounted directly in front of the PSU. This made me think about airflow etc. Actually, the PSU meant for this chassis is a standard ATX PSU. But it's placed in the front of the chassis, with a 220-volt wire running through the system, passing through a teeny weeny hole along all kinds of system wiring. And this would place a "block" in front of the largest air intake too. So because of this, and the fact that I wanted to use an industrial 2U PSU, this was my solution to it all. I'll show the whole "fitting" of the PSU in a complete version of the ToDo when building systems 8 and 9, which will be the last ones for this test.

Thanks for making a comment! Please don't hesitate to make another one whenever you feel like it. All ideas and suggestions are more than welcome.

*Thanks TicToc...*
OK!! You found it for me as well!! Thanks my friend... Yes this looks really interesting, that's for sure!! I'll try and find a Danish supplier for Test System 7.
And if you look at this image, you can see that it's probably possible to remove the "plastic protection" so that the Papst fans can do their magic. Thanks TicToc









Answering Scubadiver's question, I read the whole thread and stumbled over a question I didn't notice back then. "JetPack12" asked:
Quote:


> Also, (and more importantly) it looks like you have a stack of three systems, why is this not 3 x 4 GPU => 12 GPU mega-rig!


Actually I've got 7 of these systems, and several 1U systems as well. The reason why I didn't just "fill up" with GPUs was that the other mobos only had 1 or 2 PCI-e slots. And I've only just found a system which seems to work properly. Another reason is that I'm going to be at university for 3 more years, and the $'s are not that great for the time being.







That being said, you can almost guess how much I'm spending on e.g. power and on bits and pieces. The power bill tripled after beginning this project, and each system cost roughly .... Let's see, 1 US$ = 6.00 DKR:
Chassis 2U: 1300,- --> 216$
PSU 2u: 1400,- --> 233$
CPU: 1700,- --> 284$
CPU Cooler 2U: 550,- --> 92$
Mobo: 2000-2500,- --> 334$ (the mobo we are using now)
Memory: 600,- --> 100$
GT640: 4x 650,- = 2600,- --> 434$
SSD: 550,- --> 92$
DVD: 250,- --> 42$
About 11000,- --> 1835$
Well, actually I didn't think it was this much! The PSU went up and the mobo too, or else it would have been more than 200$ less. I'm not looking for any kind of compliments, of course not, but being a student, this is a pretty large part of my income. My family supports me as well. If not, this wouldn't have been possible..
But JetPak, I am going to build 5 systems of this version and have those running the whole time. It would have been so nice to be able to order it all right now







Just have to do it bit by bit







And yes, it would have been a nice rig. Just love the idea too








But this is my way of supporting SETI.. and others.. By building and finding a semi-professional system which is not that expensive to build and is still able to process a reasonably large amount of data. Maybe then someone will be inspired to build the system and that way support BOINC projects.









.


----------



## DanHansenDK

Hi,

OK, I found a card in the kingdom, but it's a Zotac... I would very much like the one TicToc found! It has a better heatsink and 2 fans. And it's 30-40% cheaper.. at least at newegg.com. I'm just not sure if they deliver to my part of the world.









Zotac: https://www.dustinhome.dk/product/5010801993/geforce-gtx-750-ti-lp-grafikkort?ssel=false?csref=pricecompare_kelkoo&utm_campaign=kelkoo&utm_source=kelkoo.dk&utm_medium=pricecompare

MSI: http://www.newegg.com/Product/Product.aspx?Item=N82E16814127836

.


----------



## Tex1954

Howdy,

Great work going on here... but I have a question...

Why not build one server with full sized powerful GPU's like HD7970 or GTX 680 or something?

ONE of those full power GPUs puts out more PPD than 6 of the small ones..


----------



## DanHansenDK

Hi Tex,
Quote:


> Why not build one server with full sized powerful GPU's like HD7970 or GTX 680 or something?
> ONE of those full power GPUs put out more PPD than 6 of the small ones..


When choosing the GT640 I was not sure which one to choose. I had to find the card which endured heat the best. A guy from SETI told me about the Asus GT640, and then I decided to try those out.
Are you sure the GTX680 does 6x the work? I'm not.. I'll check again just to be sure, but if this is really true, then let's talk about it. Or is it the card we talked about earlier on? I'll read your earlier post again, check the number of CUDA cores on the GTX680, and then I'll get back.









Thanks for the help my friend









*STATUS 24.11.2014 - 09:32am:*
Can't find the post I thought about. I seemed to remember we talked about them earlier on, but I've read as far back as 10.2 and couldn't find it. I'll go and find the info again.
BTW, "Test System 4 aka Halifax" hasn't been running a full month yet, and it's crunching for both [email protected] and [email protected] (SETI is down for the time being, as you might know). What I wanted to show you is that "Halifax" right now has a RAC, or average per day, of 44,974.91. Pretty good for an "old lady" driving a Volkswagen. "Beaufort" is at 37,009.17.









*STATUS 24.11.2014 - 09:32am:*
Hi TicToc, I can't find the card, only in the "normal" version. Do you know where to find it? And do you think it's that much better than the GTX750 low profile edition??? I have to consider the price as well, you know, and the GTX750 I can get for about 149 US$ plus shipping, of course.









Found only these standard edition cards:
http://www.fudzilla.com/home/item/27240-point-of-view-tgt-gtx-680-ultracharg/27240-point-of-view-tgt-gtx-680-ultracharg?showall=1

Looking forward to hear from you









I found these tests on the GTX750. Only one is a low profile card, but I know that MSI has made one too, as you guys showed me the other day.








http://uk.hardware.info/reviews/5271/geforce-gtx-750-and-750-ti-asus-vs-msi-vs-kfa2

OK, I can see what you mean now







It compares those two cards. But still, do you think we are able to find it in Low Profile?? I can't.








GTX 680 compared to GTX 750 Ti:
http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-680/specifications
http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-750-ti/specifications


----------



## Tex1954

Well, its been a while since I updated this, but have a look at...

https://docs.google.com/spreadsheet/ccc?key=0AmE73gddEiprdFBHMml3dmxqTjB4T21Mc3Mya3p1Snc&hl=en_US#gid=85

When SETI gets some more WUs online, I will run some on my GTX 670 for you and let you know how it does. On my 560 Ti cards (900 MHz) they take on average 11 minutes for the new versions, and the GTX 580s run much faster...


----------



## DanHansenDK

Hi Tex,

My GTX 770 has the exact same number of CUDA cores as the GTX 680. I guess their performance will be much alike! The GTX 770 chews through most jobs in about 8 minutes or so. I'm sorry to tell you that I'm not quite sure about the WUs, other than I know they're a larger workload for the system. It's when these "fellows" occur that "the heat is on".









I just noticed that the GTX 750 Ti hasn't got that SLI stamp under "specifications", which makes me question its capability to be used in a multiple GPU installation!?!? This I have to check before going ahead with test 7. Sadly, this month I have to buy a calculator for uni, new winter tires for my little car, and a present for my mother.







So there will be no adding to the hardware park









Anyway, there are enough issues to be solved, so I'll live... I'm just finishing up the installation of my file server with backup, and when that's done, I'll get right back to this. I've been delayed several times. Now I'm going to solve the shell script issues and get it done. Watchdog* shell scripts....

But, if you'll find any LowProfile edition cards you believe to be super good for the project, please let me know








Thanks for the help my friend









BTW, regarding the spreadsheet, can you explain some of it. What is it I'm looking at? Numbers of performance between the different cards, yes, but it's a little technical


----------



## Finrond

You don't need SLI for BOINC crunching; SLI is a gaming technology. I believe you can have SLI enabled for BOINCing in some instances, but it does not improve performance.
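For what it's worth, the BOINC client side of this is just one option: by default the client may only use the "best" GPU, and the `use_all_gpus` flag in `cc_config.xml` tells it to crunch on every card it detects, SLI or not. A minimal example:

```xml
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

Put it in the BOINC data directory and restart the client (or tell it to re-read the config file) and all four cards should pick up work.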


----------



## Tex1954

Yup, the spread sheet was just clocks, times, and points for various projects. I have in mind for the future to redo it and update it for easier reading and searching... but not now.

In any case, I don't think you will find much better than what you have in a low profile card. But, for points and cost sake, maybe it makes more sense to do it the way you are.

Thing is, if you look at how long a GT 640 takes to do a task and its PPD (Points Per Day), then compare 4 of those to ONE higher powered card like a 780 Ti or something, you may find one 780 Ti card will produce more PPD than 4 to 6 of the 640s, not to mention the additional cost of the motherboard and such.

Just a thought for you...

I can imagine a motherboard with 3 or 4 780 Ti cards or 4 7970 cards in one system... it would require a 1200-1600 W PSU too... but it would put out more PPD than all your systems combined..

Just thinking out loud mostly... It's a hobby for me and I suspect a hobby and learning experience for you as well.

One thing for sure, it is really fun watching this thread! And also, there is no such thing as a bad BOINC machine... they are ALL good!


----------



## DanHansenDK

Hi Finrond,
Quote:


> You don't need SLI for boinc crunching, SLI is a gaming technology. I believe you can have SLI enabled for boincing in some instances, but it does not improve performance.


So I'm able to CUDA the hell out of 4 GPUs anyway??? Thanks in advance my friend









Hi Tex,
Quote:


> Yup, the spread sheet was just clocks, times, and points for various projects. I have in mind for the future to redo it and update it for easier reading and searching... but not now.


Maybe it's not needed. It's just that it's the first time I've laid eyes on it, and often when seeing things for the first time it's difficult to "understand" them.








Quote:


> Thing is, if you look at how long a GT 640 takes to do a task and PPD (Points Per Day), then compare 4 of those to ONE higher powered card like a 780 Ti or something, you may find one 780 Ti card will produced more PPD than 4 to 6 of the 640, not to mention the additional cost of the motherboard and such.


Now I get it!!! Sorry I'm a bit slow! Yes, of course! You are completely right! I hadn't considered that! Well, this is a third variable in the calculation, which I of course have to keep in mind! Thanks for setting me straight!








Quote:


> I can imagine a motherboard with 3 or 4 780 Ti cards or 4 7970 cards in one system... it would require a 1200-1600 W PSU too... but it would put out more PPD than all yours systems combined..


Well, the thing is, a "standard" PSU actually fits this 2U chassis! Therefore, if it's possible to get such a PSU, this system/chassis may be built using cards like that! But the PSU will then get in the way of the air flow. The largest PSU I've found is this 550W 2U industrial PSU, which cost me "one month of eating oatmeal".







I'm not sure it's even possible to find such a 2U PSU!?!? What I think we'll do is build this Test System 6 like Halifax and Beaufort, then Test System 7 with the GTX750 Ti. When all this is done and I've made the ToDo, then let's build this remarkable system using those "kick-ass" cards.







Then we'll have some time to find the right vendor and the right PSU







What say you my friend









Quote:


> Just thinking out loud mostly... It's a hobby for me and I suspect a hobby and learning experience for you as well.


You are d... right it's a learning experience







When I started fiddling with Linux, I said to myself: do you want to sit day in and day out reading books? Or do you want to learn by doing! It's not that hard, I concluded.







Quote:


> One thing for sure, it is really fun watching this thread! And also, there is no such thing as a bad BOINC machine... they are ALL good!


Well, you guys were the only ones who showed real interest. I've been writing in several places to begin with, but it's not everywhere that stuff like this is appreciated. There are a few guys from Berkeley who show interest and who helped me in the process of getting here, so I'll post the ToDo there as well when it's finished.







Just to be fair









BTW, the 2 fully mounted test systems have reached a daily average of:

halifax work 65,812.33
beaufort work 53,172.02


----------



## DanHansenDK

Hello guys









I was wondering if you might think a little about the "BIOS Fan Configuration" of the CPU fan and the 4 Papst chassis fans!? I'm asking because I'm not sure what the right thing to do is! I've spent some time trying to figure this out. If you look back to these posts, you can see I'm trying to find the right setup.

Here's some of the test's: http://www.overclock.net/t/1467918/project-headless-linux-cli-multiple-gpu-boinc-server-ubuntu-server-12-04-4-14-04-1-64bit-using-gpus-from-geforce-gt610-640-to-crunch-data/150#post_23147508

1. OK, what's the problem here!?!? Regarding the chassis fans, I'm thinking: why set them up to run at anything other than 100% all the time?? Why not just make them run "full throttle" all the time? The reason I was looking for "the perfect config" is that we might give the systems a daily "break", some time where the system is not working.. Another reason is that if it's cold, if it's a better rack and the temperatures are not that high, maybe it's not necessary for the chassis fans to work that much. On the other hand, why not? The fans are built for it, and not that many mobos have 4+ fan connectors!?!?

2. Regarding the CPU, I think it's a good idea for this particular 2U fan to have some sort of a "resting mode". Not much, only as low as 80%, but this is because this 2U fan is good for more than 10,000 RPM!! And running some tests earlier on, as I showed in here, lowering it didn't change the CPU temperature that much. Not before reaching a very low RPM, 40 or 50% that was!

Please let me know what your thoughts are. I know you are the masters of CPU & GPU heat and fan control.


----------



## DanHansenDK

BTW Tex,

Can I supply you with some data from the GT640 and GTX770 so that you can get it into your spreadsheet?? If so, just say the word. Of course, if it's only for personal use it's not needed.









*Status for the upcoming "Test System 7" - 27.11.14 03:09pm:*
I've just looked at the power consumption of the card in question for "Test System 7"! It looks like the GTX750 Ti doesn't use that much more power! And it seems to endure heat the same great way. 4x GPUs of the current kind in this system use 217W on average. We may just be able to run those new GPUs on the same P. If close to the limit, we'll take a step up the ladder of course. Here's the specs:

GT640 Thermal and Power Specs (the spec page lists two values, apparently one per GT640 variant):
Maximum GPU Temperature (in C): 98 C / 95 C
Maximum Graphics Card Power (W): 65 W / 49 W
Minimum System Power Requirement (W): 350 W / 300 W

GTX750Ti Thermal and Power Specs:
Maximum GPU Temperature (in C): 95 C
Graphics Card Power (W): 60 W
Minimum System Power Requirement (W): 300 W
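A quick sanity check against the 550 W 2U PSU can be done from the card figure above; note the 150 W allowance for CPU, mobo and drives is my own rough assumption, not a measured number:

```shell
#!/bin/sh
# Rough power budget for a 4x GTX750 Ti build on the 550 W 2U PSU.
GPU_W=60          # per-card figure from the spec list above
N_GPU=4
REST_W=150        # assumed CPU + mobo + drives (hypothetical)
TOTAL=$((GPU_W * N_GPU + REST_W))
echo "estimated load: ${TOTAL} W of 550 W"
```

So even with a generous allowance for the rest of the system, four of these cards should leave headroom on the 550 W unit.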


----------



## DanHansenDK

OK, regarding those GPUs for Test System 7, here's what I've been able to find so far, with help from TicToc, Tex, Finrond etc.:

MSI GTX750 Ti Low Profile:
http://www.newegg.com/Product/Product.aspx?Item=N82E16814127836&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=6146846&SID=1ep3cbhqf302l

KFA2 (Galaxy) GTX750 Ti Low Profile:
http://uk.hardware.info/reviews/5271/geforce-gtx-750-and-750-ti-asus-vs-msi-vs-kfa2

Zotac GTX750 Ti Low Profile:
http://www.pricerunner.dk/pi/37-3015404/Grafikkort/Zotac-GeForce-GTX-750-Ti-LP-%28ZT-70606-10M%29-Produkt-Info

This Low Profile GPU has double the number of CUDA cores compared to the GT640. Well, almost.








http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-750-ti/specifications
http://www.geforce.com/hardware/desktop-gpus/geforce-gt640/specifications


----------



## Tex1954

Quote:


> Originally Posted by *DanHansenDK*
> 
> BTW Tex,
> 
> Can I supply you with some data from GT640, GTX770 so that you can get this in your spreadsheat?? If, just say the word. Of course, if it's only for personally use it's not needed


Sure, put it in the spreadsheet on page 1 and I will update things..

But it is EASY to calculate them yourself with any spreadsheet or calculator.

1440 minutes in a day DIVIDED by the minutes it takes to complete a task equals tasks per day.

Tasks per day TIMES points per task equals PPD.

If you run two tasks per GPU, then use 2880, OR multiply the PPD by two.
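Tex's formula, written out as a small shell check. The 11-minute task time and 60 points per task are made-up example numbers, not measurements from any of the systems in this thread:

```shell
#!/bin/sh
# PPD (Points Per Day) estimate from the formula above; inputs are hypothetical.
MIN_PER_TASK=11        # minutes one GPU needs per task
POINTS_PER_TASK=60     # credit granted per task
TASKS_PER_DAY=$((1440 / MIN_PER_TASK))       # 1440 minutes in a day
PPD=$((TASKS_PER_DAY * POINTS_PER_TASK))
echo "$TASKS_PER_DAY tasks/day -> $PPD PPD"
# Running two tasks per GPU? Use 2880 minutes, or simply double the PPD.
```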










Happy Thanksgiving!


----------



## DanHansenDK

*Status "Test Systems 4 & 5 aka Halifax & Beaufort" - 28.11.14 07:48pm:*

I'm somehow impressed! As you can see, "Halifax" has only been at it for nearly a month and "Beaufort" for 18 days! Already they've done this much for [email protected]. They've done some work for [email protected] as well, until it crashed and "bruno" needed first-aid.







So it's not 50/50 right now, and that's the reason why the RAC increases that much at [email protected]. Anyway, I think it's looking good. I'm looking very much forward to seeing how they'll manage when both projects are up and running. It should be the same, but it isn't. I don't know why, but I seem to remember that this system does better working Asteroids work units.

Average daily and total since system upstart:
31 Oct 2014, 2:25:43 UTC --> halifax work 68,554.96 1,365,600
10 Nov 2014, 4:27:50 UTC --> beaufort work 54,897.58 1,024,320

Hi Tex,
Quote:


> Sure, put it in the spreadsheet on page 1 and I will update things..


I'll have to have a little help when doing that







Let's look at it a little later on








Quote:


> Happy Thanks Giving!


Happy thanks giving to all you nice Americans









_Here's a re-run







_
_It's a little important so I would love to hear from you guys







_

I was wondering if you might think a little about the "BIOS Fan Configuration" of the CPU fan and the 4 Papst chassis fans!? I'm asking because I'm not sure what the right thing to do is! I've spent some time trying to figure this out. If you look back to these posts, you can see I'm trying to find the right setup.

Here's some of the test's: http://www.overclock.net/t/1467918/project-headless-linux-cli-multiple-gpu-boinc-server-ubuntu-server-12-04-4-14-04-1-64bit-using-gpus-from-geforce-gt610-640-to-crunch-data/150#post_23147508

1. OK, what's the problem here!?!? Regarding the chassis fans, I'm thinking: why set them up to run at anything other than 100% all the time?? Why not just make them run "full throttle" all the time? The reason I was looking for "the perfect config" is that we might give the systems a daily "break", some time where the system is not working.. Another reason is that if it's cold, if it's a better rack and the temperatures are not that high, maybe it's not necessary for the chassis fans to work that much. On the other hand, why not? The fans are built for it, and not that many mobos have 4+ fan connectors!?!?

2. Regarding the CPU, I think it's a good idea for this particular 2U fan to have some sort of "resting mode". Not much, only as low as 80%, but that's because this 2U fan is good for more than 10,000 RPM!! And in some tests earlier on, as I showed in here, lowering it didn't change the CPU temperature that much, not until reaching a very low RPM, 40 or 50% that was!

Please let me know what your thoughts are. I know you guys are the masters of CPU & GPU heat and fan management!
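To make the "resting mode with a floor" idea concrete, here is a minimal sketch of such a fan curve as a shell function: never below 80% duty, 100% at a "critical" temperature. The 40 °C and 70 °C breakpoints are placeholder values of mine, not anything measured on these systems.

```shell
# Hypothetical fan curve: duty (%) as a function of CPU temperature (C).
# Floor of 80% as discussed above; the 40/70 C breakpoints are assumptions.
fan_duty() {
    temp=$1
    if [ "$temp" -ge 70 ]; then
        echo 100                                  # at/above critical: full throttle
    elif [ "$temp" -le 40 ]; then
        echo 80                                   # resting-mode floor
    else
        echo $(( 80 + (temp - 40) * 20 / 30 ))    # linear ramp 80..100
    fi
}

fan_duty 35    # -> 80
fan_duty 55    # -> 90
fan_duty 75    # -> 100
```

A BIOS "smart fan" setting with a minimum duty and a critical temperature behaves roughly like this curve.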

.


----------



## DanHansenDK

*STATUS "Test System 4" aka "Halifax" - 02.12.2014 02:30pm:*
Still running perfectly, even though it's the standard BIOS cooling config. I'm impressed. I think we've got ourselves our "Mark 1" system!









Code:





# uptime
 14:21:27 up 29 days, 21:55,  1 user,  load average: 5.22, 4.96, 4.86

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 46 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +51.0°C  (high = +80.0°C, crit = +100.0°C)

*STATUS "Test System 5" aka "Beaufort" - 02.12.2014 02:31pm:*

Code:





# uptime
 14:28:30 up 18 days, 10:04,  1 user,  load average: 4.86, 4.88, 4.85

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 44 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 48 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 45 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Code:





# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +59.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +57.0°C  (high = +80.0°C, crit = +100.0°C)

.


----------



## DanHansenDK

Hello Guys,

Hope you are doing well!!!

Test System 6 aka Wellington is being put together right now! Test System 5 aka Beaufort will be the test system for "Headless Linux CLI Multiple GPU Boinc Server" version 2. I've just received the card which will be tested: "GeForce GTX 750 Ti 2GB 128-Bit GDDR5 PCI Express 3.0 x16" - Link: http://www.kfa2.com/750tioc.shtml
The reason why Beaufort will be converted to the test system for version 2 is its larger industrial 2U PSU. It's rated at 500W, and I think that is enough for 4 of these new cards. Version 1 is running at 100% using an i5-4690K CPU and 4x GT640 Low Profile and uses less than 300 watts, actually as little as 230 watts. This new card from KFA2 has twice as many CUDA cores and still uses very little power. I love these Low Profile cards with low power consumption.



Let's see how it goes!









*Status "Test System 6" aka "Wellington" - 01.04.15 22:54pm:*

......


----------



## tictoc

Awesome to see an update.









The 750 Ti that you are going with is the most powerful low-profile consumer card available. It should be a good performer, and Maxwell sips power, so the 500W should be up to the task as long as you aren't overvolting/overclocking the GPUs.


----------



## Finrond

Can't wait to see the results!


----------



## DanHansenDK

Hi TicToc & Finrond









Thanks guys!! Yes, we are going to launch version 2 of this Semi SuperCruncher







Regarding the GPU it sounds really good! I hope it will run 4 of those under 500watts and no, no overclocking. I'm trying to construct a lasting system







Therefore I think it's best to keep it standard. Also because of the fact that these systems are going to run at 100% 24/7/365









A couple of hours and then the system will be done. In this first version I will test 1 of the new cards, to see how it runs and how hot it gets. I'll throw in a couple of GT640s to make it nice and "stuffed", making the temperature rise.







But from the sound of it, according to what you say, I might as well have bought all 4 to begin with. Then again, that's double the price, so maybe it's OK to try one first.









Kind Regards,
Dan


----------



## DanHansenDK

*Status "Test System 6" aka "Wellington" - 03.04.15 06:29am:*
_....formerly known as..._

Correction!! "Test System 6" aka "Wellington" will be renamed to "Beaufort", because the 500 watt 2U PSU is mounted in this one and it's much easier to rebuild this case using the existing PSU. Therefore the new Test System 6 will be named "Beaufort", and "Test System 4", formerly known as "Beaufort", will from now on be called "Wellington". This means the systems you have been following are now called "Test System 5" aka "Halifax" and "Test System 4" aka "Wellington". Sorry for the inconvenience!









*Status "Test System 6" aka "Beaufort" - 03.04.15 06:36am:*
This is what we are going to use in version 2:

1x ASRock OC Formula Z97
Intel i5-4690K
Industrial 2U Cooler from JAC
Industrial 2U PSU ATX500W
4x Industrial Papst fans 3600 RPM
1x 4Gb Kingston HyperX Genesis X2 Grey S.
KFA2 GTX750TI-GDDR5 PCIe 3.0 x16
KFA2 GTX750TI-GDDR5 PCIe 3.0 x16 (if test is satisfying)
KFA2 GTX750TI-GDDR5 PCIe 3.0 x16 (if test is satisfying)
KFA2 GTX750TI-GDDR5 PCIe 3.0 x16 (if test is satisfying)

Problem!
This MoBo has only got 3 PCIe 3.0 slots; the 4th is PCIe 2.0! The GPU/graphics card is a PCIe 3.0 type! Had it been the other way around it would have meant nothing, but what about this situation!?

These are the items we are going to use in this version 2



The GPU is KFA2 GTX750TI-GDDR5 PCIe 3.0 x16 & the CPU is one of my favourites, the Intel i5-4690K



The MoBo is 1x ASRock OC Formula Z97. It can control 4 Papst fans onboard!


----------



## Tex1954

Quote:


> Problem!
> This MoBo has only got 3 PCIe 3.0 slots and the 4'th is a PCIe 2.0! The GPU/Graphic Card is a PCIe 3.0 type! Had it been the other way around it would have meant nothing, but what about this situation!?!?


Well, the cards likely wouldn't saturate the bus even if the PCIe slot were running in x8 mode. I would not worry about it unless you are running Einstein or SETI...

Actually, I'm curious myself how 4 would perform and look forward to your results!










As an aside, I have an HD 6990 PCIe 2.0 card (dual 6970 GPUs), and a full-speed PCIe 2.0 x16 bus isn't fast enough to supply both GPUs running Einstein... But the 6990 does fine running most everything else except Moo! Wrapper, which tries to use BOTH GPUs at the same time. Einstein is about the worst PCIe bandwidth hog on the planet with AMD GPUs running OpenCL...
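For a rough sense of why a gen2 slot is rarely the bottleneck for most projects, the per-lane rates can be computed from the published line rates and encodings (5 GT/s with 8b/10b for PCIe 2.0, 8 GT/s with 128b/130b for PCIe 3.0):

```shell
# Back-of-envelope PCIe bandwidth: line rate (GT/s) * encoding efficiency
# / 8 bits per byte, in MB/s per lane, then scaled to an x16 slot.
awk 'BEGIN {
    gen2 = 5.0 * 8 / 10 / 8 * 1000       # 500 MB/s per lane
    gen3 = 8.0 * 128 / 130 / 8 * 1000    # ~985 MB/s per lane
    printf "PCIe 2.0 x16: %.0f MB/s\n", gen2 * 16
    printf "PCIe 3.0 x16: %.0f MB/s\n", gen3 * 16
}'
```

So even a gen2 x16 slot offers roughly 8 GB/s; only bandwidth-heavy applications like Einstein come close to stressing that.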


----------



## DanHansenDK

Hi Tex,

_".....I would not worry about it unless you are running Einstein or SETI..."_

Weeeeee.....







That's exactly what I do: BOINC/SETI ;( Well, I'll try even though it may not work. We have to!









_"....Actually, I'm curious myself how 4 would perform and look forward to your results!.."_
Thanks! And me too







I'm guessing I'll be ready to test the first card in an hour or two


----------



## DanHansenDK

*Status "Test System 4" aka "Wellington" (reinstalled) - 03.04.15 12:08am:*

IMPORTANT!!!
CUDA 6.5 is no longer current!! Version 7.0 is out! I'm testing the installation of version 7.0 to see if it's enough just to change the file information or if we have to change the ToDo completely. So far it looks OK! The installation is running after modifying the file information only. And we are still using the .deb package network installation file, of course! No .run for us!











There are a few hiccups with this reinstallation and CUDA 7.0 implementation! Even though the 4 GPUs are OK and available, GPU tasks don't seem to be downloaded and run!?

Code:





# nvidia-smi -a | grep GPU
Attached GPUs                       : 4
GPU 0000:01:00.0
    GPU UUID                        : GPU-b377014d-90df-a691-fb4d-09b7a201847d
    MultiGPU Board                  : N/A
    GPU Operation Mode
        GPU Link Info
        GPU Current Temp            : 26 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
GPU 0000:02:00.0
    GPU UUID                        : GPU-d5813be2-bf30-6c90-a591-90fef765984f
    MultiGPU Board                  : N/A
    GPU Operation Mode
        GPU Link Info
        GPU Current Temp            : 26 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
GPU 0000:03:00.0
    GPU UUID                        : GPU-bf213a08-c3c6-346b-53ff-5ff7d82c5c74
    MultiGPU Board                  : N/A
    GPU Operation Mode
        GPU Link Info
        GPU Current Temp            : 26 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
GPU 0000:0C:00.0
    GPU UUID                        : GPU-44201af4-7401-6976-bb2f-9f3ecc011ef6
    MultiGPU Board                  : N/A
    GPU Operation Mode
        GPU Link Info
        GPU Current Temp            : 24 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Here's the log from Boinc/BoincTasks:



As you can see, CUDA seems to be ready and the GPUs accepted!!! But still it doesn't run GPU tasks!?
When connecting to a project I'm using the command-line tool boinccmd. I've got a "standard"/"default" setup which tells BOINC to run CPUs at 50%. Right after attaching to a project I change the computer's status to "Work", and the setup for "Work" is 10% for the CPU and all GPUs. Afterwards I run a "boinccmd ........ update" once again. And then, in the past, the system just took off!! But now it doesn't #%&/%¤#"#¤%&.
If you know what this might be, please let me know. I hate searching for errors when the answer is known to everybody else.
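For reference, the attach-then-update sequence described above can be sketched with boinccmd. The echo prefix makes this a dry run that only prints the command lines; the host, password, project URL, and account key are placeholders, not the real ones.

```shell
# Dry-run sketch of the attach-then-update sequence (remove "echo" to run
# it for real against a local BOINC client). All values are placeholders.
BCMD="echo boinccmd --host localhost:31416 --passwd mypassword"

$BCMD --project_attach http://asteroidsathome.net/boinc/ ACCOUNT_KEY
$BCMD --set_run_mode auto      # obey computing preferences (the "Work" profile)
$BCMD --project http://asteroidsathome.net/boinc/ update
```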








NB! The logfile says "Don't use GPU while active"!? I haven't told the system or the default setup anything about restricting the GPUs!!! This might be the problem; I just can't find the reason for it!!!

Here's some more from the logfile. The work for NVIDIA is requested, but when trying to download those jobs it all goes to (/&%¤#¤%&/(



*And here's the proof!!!*
Jobs are being downloaded but fail to run. Does it happen during downloading? What seems to be the problem?





And no jobs are being downloaded for SETI either!? Usually there were some jobs for AstroPulse v7.04 (Linux GPU jobs). But not on this system, not anymore!?



*Status "Test System 6" aka "Beaufort" - 03.04.15 01:50pm:*
Assembled and getting there







I'm making the GPU fit Low Profile Systems







I'll drop a few images in a few


----------



## Tex1954

Well, looks like your "midified" file is doing the job!

Can't wait to see how well it runs tasks...


----------



## DanHansenDK

*Status "Test System 6" aka "Beaufort" - 03.04.15 03:02pm:*

Sorry guys!!! I'm so sorry!!







I've been working around the clock and the system is ready to be tested with the new GPU!! But neither SETI nor Asteroids can provide GPU jobs for us!! I've been trying everything and there are still issues. If you know a project which hasn't got issues with Linux GPU jobs, please don't hesitate to let me know and I'll join right away. I'm all for science and space projects, but then again, we have to have something to test with. So if you know a project which has those jobs and hasn't got the issues, please let me know.









KR/Dan

Hi Tex,
Do you know a project which offers Linux-based GPU work units?







?????

*Status "Test System 6" aka "Beaufort" - 03.04.15 03:27pm:*
This is how you make this GTX750TI slim card fit the 2U chassis! Use the low-profile brackets (which came along in the package!!! _not on an Asus card! There you'll have to buy the brackets for a Low Profile edition card separately!?_). Anyroad, fit the bracket (both brackets if you need the VGA port! If not, unplug the VGA connector to give room for airflow).











This is how it looks when done! Please notice the 2 small screws which have to be unscrewed before the standard bracket can be replaced! Use a PZ1 bit for the first screw and a PZ2 for the second!! Note that you will break the smallest screw if you do not do this!! You can use a zip tie to fix the loose wire and keep it from getting stuck somewhere.



System ready for test. Now we just need some Linux jobs for our new GPU!











Looks nice and tight! If it runs and keeps its pace, I think we can make good use of it. I'm just a little worried about heat issues, whether this card can take it!? 2 GT640 cards have died because of this, and those cards were not even close to their limits, as you know! Asus was supposed to be a "better" card, or better quality, I was told! Let's see about that.







(when we get some work)











.


----------



## Finrond

Asteroids has been having download errors for about a week now. SETI might just be low on tasks, I'm not sure.


----------



## DanHansenDK

Hi Finrond









Thanks!!! Well, it's not the first time. I'm not very lucky when it comes to getting work-units whenever a new system is ready... This is the third time in a row







It's OK, it's just that I'm so excited and want to test this new card. You spend $1000 on a new system and of course you want to see what you bought and get your money's worth.









Any suggestions on a project which HAS GOT GPU work units for Linux?









BTW! How many i5 2500 have you got and what do you want for each??? Fans are not needed









KR/Dan


----------



## Finrond

Just the one. It is my main gaming system.


----------



## DanHansenDK

Hello again FinRonD <----










OK, it's just that it said "(13 items)"









OK, it's because I'm looking for a good deal on 4 CPUs of the same kind. I'd prefer Socket 1155, but almost anything goes. It's for my mobile office.









Do any of you know a project which has working GPU units right now??? [email protected] ???? (CUDA and/or OpenCL GPU/Linux)








Found this, but maybe you know some better stuff. These pages are not very new







http://boincfaq.mundayweb.com/index.php?view=471&language=1

.


----------



## DanHansenDK

*Status "Test System 6" aka "Beaufort" - 03.04.15 21:12pm:*
HALLELUJAH!!! We got a GPU job!!!! And the card that got it was the GTX750TI!







Now we are rocking!



Let's see how the temperature looks so far:

Status after 10 minutes:

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 28 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C   <-------------------- GTX750TI
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Status after 20 minutes:

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 28 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

Status after 30 minutes:

Code:





# nvidia-smi -a | grep Temp
    Temperature
        GPU Current Temp            : 28 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 49 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A

_ongoing work....._

.


----------



## Tex1954

Patience with SETI... the system had to sit and ask a few times before it got any tasks...

Collatz and PrimeGrid have work as well and PG will really stress your GPU's..

BTW, CONGRATS!!!!!


----------



## DanHansenDK

Hi Tex,

Thanks! I tried to join Einstein, but they had closed for new members ;( You could, however, join using JUST your email address, but when you are using the BOINC command-line tool like I do, boinccmd, then it's not possible to "just use your email". So I'll have to wait for some more work from SETI and Asteroids to see the results. I got a few jobs last night, so I'll look at them and see if I can calculate some sort of performance.


----------



## DanHansenDK

Hello again,

*Status "Test System 6" aka "Beaufort" - 04.04.15 31:12pm:*

And here it is:

GeForce GTX 750TI

Code:





Name    period_search_6709_1426501029.728701_338994_0
Workunit        29038172
Created 17 Mar 2015, 2:26:55 UTC
Sent    4 Apr 2015, 6:51:32 UTC
Report deadline 14 Apr 2015, 18:51:32 UTC
Received        4 Apr 2015, 11:23:10 UTC
Server state    Over
Outcome Success
Client state    None
Exit status     0 (0x0)
Computer ID     128076
Run time        1 hours 9 min 51 sec     <----------------------------
CPU time        6 sec
Validate state  Valid
Credit  480.00
Device peak FLOPS       259.12 GFLOPS
Application     Period Search Application v101.12 (cuda55)

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
CUDA RC12!!!!!!!!!!
CUDA Device number: 0
CUDA Device: GeForce GTX 750
Compute capability: 5.0
Multiprocessors: 4
Grid dim: 64 = 4*16
Block dim: 128
12:22:02 (2027): called boinc_finish

GeForce GT 640

Code:





Name    period_search_1010_1426497677.081390_420624_1
Workunit        28683836
Created 16 Mar 2015, 21:19:35 UTC
Sent    28 Mar 2015, 22:36:09 UTC
Report deadline 8 Apr 2015, 10:36:09 UTC
Received        29 Mar 2015, 11:06:10 UTC
Server state    Over
Outcome Success
Client state    None
Exit status     0 (0x0)
Computer ID     82538
Run time        2 hours 33 min 2 sec     <----------------------------
CPU time        24 sec
Validate state  Valid
Credit  480.00
Device peak FLOPS       176.69 GFLOPS
Application     Period Search Application v101.12 (cuda55)

<core_client_version>7.0.27</core_client_version>
<![CDATA[
<stderr_txt>
CUDA RC12!!!!!!!!!!
CUDA Device number: 0
CUDA Device: GeForce GT 640
Compute capability: 3.5
Multiprocessors: 2
Grid dim: 32 = 2*16
Block dim: 128
12:54:23 (4672): called boinc_finish

Seems to be quite as I expected! There are a little more than twice as many CUDA cores on this new card, but then again, nobody knew how it would handle this! What's really interesting is how it does compared to the very expensive GTX470.







I'll take a look later on









And here are the results for the card that is 4 times as expensive. The numbers varied by a few minutes; I chose one in between:

GeForce GTX 470

Code:





Name    period_search_1010_1426497677.081390_424175_1
Workunit        28687387
Created 16 Mar 2015, 21:23:09 UTC
Sent    28 Mar 2015, 23:47:09 UTC
Report deadline 8 Apr 2015, 11:47:09 UTC
Received        29 Mar 2015, 1:35:31 UTC
Server state    Over
Outcome Success
Client state    None
Exit status     0 (0x0)
Computer ID     118083
Run time        46 min 54 sec     <----------------------------
CPU time        5 sec
Validate state  Valid
Credit  480.00
Device peak FLOPS       750.56 GFLOPS
Application     Period Search Application v101.12 (cuda55)

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
CUDA RC12!!!!!!!!!!
CUDA Device number: 0
CUDA Device: GeForce GTX 770
Compute capability: 3.0
Multiprocessors: 8
Grid dim: 128 = 8*16
Block dim: 128
01:35:09 (5568): called boinc_finish

KR
Dan


----------



## Tex1954

Well, so far so good looks like...

You know, I use removable SSD's to power my stuff and can go from Winderz to Linux with a simple SSD swap. To really see how your systems are performing, you could consider using a Windows 7 install on an SSD so you would be able to use tools like MSI AB and such to see in detail how things are working. Once you are satisfied that the "hardware" setup is performing to expectations, simply revert to your Linux setup.

And as far as attaching to Einstein via CLI, if you already have an account there, I think you can use your project ID or something to attach if normal email stuff doesn't work...

Also, CHEAP Corsair refurb SSD's are now under warranty and Corsair will swap it out fast... they have done that now for me a couple times... (I have a lot of them).

Keep us updated! Love it!










PS: For testing purposes I set buffer to ZERO so BOINC only fetches enough work to load up the available GPU's / CPU's and nothing extra...
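On a headless box the same zero-buffer trick can be done from the CLI with a global_prefs_override.xml; work_buf_min_days and work_buf_additional_days are the documented BOINC preference names. The heredoc below is a sketch of such a file, not Tex's actual setup:

```shell
# Write a preferences override that keeps the work buffer at zero days,
# so BOINC fetches only enough tasks to keep the devices busy.
cat > global_prefs_override.xml <<'EOF'
<global_preferences>
  <work_buf_min_days>0</work_buf_min_days>
  <work_buf_additional_days>0</work_buf_additional_days>
</global_preferences>
EOF
# boinccmd --read_global_prefs_override   # needs a running client
```

The file goes in the BOINC data directory; the commented boinccmd call tells a running client to pick it up.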


----------



## DanHansenDK

Hi Tex,

_".....you could consider using a Windows 7 install on an SSD so you would be able to use tools like MSI AB and such to see in detail how things are working. Once you are satisfied that the "hardware" setup is performing to expectations, simply revert to your Linux setup..."
_
OK... Thanks... Great idea!!! But my systems are all mounted in a rack with no access other than SSH. This is why I want Linux. And because I don't want another $200 of expenses for each system as well.







Anyway, it's a good idea and I think I'm going to test it using a bootable USB pen drive. That way I don't have to change the configuration of the systems, just as you describe.







BTW, I use only SSDs for my systems. No spinning disks for me on systems like this. Only when it comes to file servers and backup systems is there something to gain from those kinds of drives. At least, that's my opinion.









I succeeded in joining Einstein using only CLI. _"..I did it this way...":_

To create an account for Einstein using the CLI:

Code:





Command: # boinccmd --create_account http://einstein.phys.uwm.edu/ [email protected] mypassword theusernameiwant

Reply from Einstein:
# account key: 722453269c2226a71e7f0e76465764212

Use the ID/account key to attach to the project:
Command: # boinccmd --project_attach http://einstein.phys.uwm.edu/ 722453269c2226a71e7f0e76465764212

Remember! If you are using a password and a port other than the default, add "--host 192.168.x.xx:xxxxx --passwd mypassword" after "boinccmd" but before everything else.
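One way to avoid retyping those global flags every time is a small wrapper function. The address and password below are placeholders, and the echo keeps this a dry run; drop it to execute for real.

```shell
# Hypothetical wrapper: supply --host/--passwd once, forward everything else.
# "echo" makes this print the full command instead of contacting a client;
# the host, port, and password shown are placeholders.
bcmd() {
    echo boinccmd --host 192.168.1.50:31416 --passwd mypassword "$@"
}

bcmd --get_state
bcmd --project_attach http://einstein.phys.uwm.edu/ ACCOUNT_KEY
```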









*Status "Test System 6" aka "Beaufort" - 05.04.15 10:27pm:*

GTX750TI chewing Einstein GPU jobs!







Nice!!

Code:





Name    PM0014_03261_314_1
Workunit        215168368
Opret   4 Apr 2015 21:15:14 UTC
Sent    5 Apr 2015 0:30:02 UTC
Received        5 Apr 2015 15:31:10 UTC
Server state    Over
Outcome Success
Client state    None
Exit status     0 (0x0)
Computer ID     11799337
Report deadline 19 Apr 2015 0:30:02 UTC
Run time        27,503.21
CPU time        4,720.82
Validate state  Valid
Claimed credit  81.67
Granted credit  4,400.00
application version     Binary Radio Pulsar Search (Parkes PMPS XT) v1.39 (BRP5-cuda32-nv270)
Stderr output

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
[08:52:10][2884][INFO ] Application startup - thank you for supporting Einstein@Home!
[08:52:10][2884][INFO ] Starting data processing...
[08:52:10][2884][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 34 MB (2015 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[08:52:10][2884][INFO ] Using CUDA device #0 "GeForce GTX 750" (0 CUDA cores / 0.00 GFLOPS)
[08:52:10][2884][INFO ] Version of installed CUDA driver: 6050
[08:52:10][2884][INFO ] Version of CUDA driver API used: 3020
[08:52:10][2884][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[08:52:10][2884][INFO ] Header contents:
------> Original WAPP file: ./PM0014_03261_DM1164.00
------> Sample time in microseconds: 1000
------> Observation time in seconds: 2097.152
------> Time stamp (MJD): 50690.188756622229
------> Number of samples/record: 0
------> Center freq in MHz: 1231.5
------> Channel band in MHz: 3
------> Number of channels/record: 96
------> Nifs: 1
------> RA (J2000): 162813.7615
------> DEC (J2000): -505112.763
------> Galactic l: 0
------> Galactic b: 0
------> Name: G4887493
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 2097152
------> Trial dispersion measure: 1164 cm^-3 pc
------> Scale factor: 1.36364
[08:52:11][2884][INFO ] Seed for random number generator is 1086045116.
[08:52:12][2884][INFO ] Derived global search parameters:
------> f_A probability = 0.04
------> single bin prob(P_noise > P_thr) = 1.2977e-08
------> thr1 = 18.1601
------> thr2 = 21.263
------> thr4 = 26.2923
------> thr8 = 34.674
------> thr16 = 48.9881
[08:52:12][2884][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 169 MB (1880 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 135 MB


----------



## DanHansenDK

Hi Guys,

At the same time we'll now test Wellington (Test System 4) with 4x GT640 running Einstein. Let's see what happens with the temperatures. This is a brand new project, never tried before. I can't see why it should be any different from running SETI and/or Asteroids, oops... Well, there is the difference that Einstein actually has got jobs/work units for Linux GPUs!







Just kidding









A great way to keep an eye on things while testing is to run this command via PuTTY/SSH and keep a little CLI window open showing the results. It will continuously show the temperature of all 4 GPUs. Of course a shell script would be better, but I haven't finished those yet.









Code:





# nvidia-smi -a -l | grep Temp

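The shell-script version hinted at above might look like this: a small function (the name is my own) that reads nvidia-smi output and warns when any GPU reaches a limit, 50 °C here to match the threshold used in this thread.

```shell
# Warn when any GPU is at or above a temperature limit.
# Usage: nvidia-smi -a | check_temps [limit_C]
check_temps() {
    limit=${1:-50}
    grep 'GPU Current Temp' | while read -r _ _ _ _ temp _; do
        if [ "$temp" -ge "$limit" ]; then
            echo "WARNING: GPU at ${temp} C (limit ${limit} C)"
        fi
    done
}

# Poll once every 30 seconds, e.g. inside a screen/tmux session:
# while true; do nvidia-smi -a | check_temps 50; sleep 30; done
```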
And here's how you attach to Einstein using only CLI/SSH, again:

Code:





Command: # boinccmd --create_account http://einstein.phys.uwm.edu/ [email protected] mypassword theusernameiwant

Reply from Einstein:
# account key: 722453269c2226a71e7f0e76465764212

Use the ID/account key to attach to the project:
Command: # boinccmd --project_attach http://einstein.phys.uwm.edu/ 722453269c2226a71e7f0e76465764212


----------



## DanHansenDK

*Status "Test System 4" aka "Wellington" - 05.04.15 10:47pm:*
Well, we are running two test systems now. And I'm thinking of booting Halifax as well. But for now let's concentrate on the 2 systems running these new GPU jobs from Einstein. I'm running a "watch window" to be sure the temperature doesn't exceed 50 degrees Celsius. I don't like 50+!







I've been changing the onboard fan control setup for the 4 Papst chassis fans! They are all running at 90% now until the "critical" setting is reached. Then of course they will run at 100%, just like the CPU cooler.







This is only possible because we are not using "standard" chassis fans and "standard" CPU coolers









Wellington chewing Einstein GPU jobs. I wish I'd joined them earlier on!











So far so f...... great











.


----------



## Tex1954

Wow! I like screens like that!!! Impressive!!!

Wait until ALL your setups are running and a single display screen won't be large enough to see them all... kinda like this:



LOL!

Keep up the great work!!!


----------



## Finrond

Good god Tex! So many tasks.


----------



## Tex1954

Quote:


> Originally Posted by *Finrond*
> 
> Good god Tex! So many tasks.


That isn't all of them...

LOL!


----------



## DanHansenDK

Hi Tex,

I'll boot them all, and then let's see who's got the largest one!









OK, I counted them, and I'll have 29 running when all the test systems are turned on. Adding my 2 developer systems I'd get another 8, but those are not systems planned for BOINC; they're developer systems which I need for learning and developing stuff. Not sure I want those running all the time. So I'm happy to announce that you win, Tex!









_
"....Wow! I like screens like that!!! Impressive!!!.."_
Well, join the Linux environment and you'll find a lot of that!! There's packages which makes the text a colorful and nice to look at. It's not at all that bad as you might think. But, I still like my windows box to handle graphical programs etc. etc. from. I tried the Linux GUI, but I'm no professor there either, so I'm sticking to Win for a while. That being said, I am running 14.04 Ubuntu on my notebook and netbook. And I'm running a cairo-dock (mac look alike graphical menu). This I love. The only thing I haven't been able to get to work, is an internal SIM-card reader/3G-WAN on my netbook. There is that possibility that it's an empty socket. You know, without any 3G-WAN/Modem. But this is another question in another forum









I am planning a kind of intranet which can be reached from the web, showing the results of the shell scripts that are watching and controlling all the systems. But this is for after I've done version 2.


----------



## Tex1954

LOL!

I am not trying to compete with you DanHansenDK, just enthused to see so many cores at work... I like that stuff!

Magic and some others have a ton of cores crunching too...

My goal is to eventually have about 3 times the cores running that I do now...


----------



## DanHansenDK

Hi Tex, Ohh boy... Had hoped for a nice little competition








No, I know that Tex... I know that... Sometimes it's a little hard to express yourself when you are writing in a language that's not your own... So, I'm sorry if it's a little confusing at times









Tex, which CPU and GPU do you want? Anything goes, or? Do you want it in a rack or a normal chassis? If it's only 1 CPU & 1 GPU, then I've designed a nice little 1U chassis only 4.4cm high. Made my own bracket for extra cooling inside too.









Anyroad, I'm trying to reach as many work units as possible as well. It's just that I'm a little scared of mounting them all in the rack. Why? Because there are only 2 large fans in the top and very little space in the back of the chassis. Therefore I need some really good ventilation in that rack. And in the summertime the air which is sucked into the rack is very hot, so problems may occur. I would have preferred not to have to build a chilled server room. Any ideas???









*Status "Test System 6" aka "Beaufort" - 07.04.15 11:08am:*
Does anybody have an idea why this one GPU out of 2 doesn't run? It's not defective, but it is another model of course! We are, as you well know, testing the GTX750TI, and at the same time I mounted a GT640. But the GT640 is not running. Here's the log:



The GT640 is mounted in the first of the 4 slots and the GTX750TI in the third. Could this be it?? It's just that the GTX750TI runs, right??

I did, of course setup all GPU's to run:

Code:





#21 Install "linux-image-extra" and x11-xserver-utils. Needed by the nvidia module and we need the X tools later on:
Command: # apt-get install linux-image-extra-$(uname -r) x11-xserver-utils mesa-utils

#22 Once installed we need to load the nvidia module. Execute the following command:
Command: # modprobe nvidia

#23 Enable headless mode this way: 
Command: # nvidia-xconfig --enable-all-gpus

But this installation is NOT new. It's still running the older CUDA 6.5. And the GTX750TI runs!!!

KR
Dan


----------



## Tex1954

Good grief, I have no idea about your stuff... I don't use CLI and such...

BUT, edit your config file... make sure the use_all_gpus option is set to 1... and if the option isn't in there, add it...



Sigh... you worked out every problem so far if that doesn't work... I'm sure you will get this fixed in due time. I have confidence in you!

So far as the other systems I want to bring online, already have them... enough to bring about another 100 cores online... LOL!


----------



## DanHansenDK

Hi Tex,

_"....Sigh... you worked out every problem so far if that doesn't work... I'm sure you will get this fixed in due time. I have confidence in you!..."_
Thanks man







Thanks... Well, I'll take a look at it again, but I'll reinstall the system first. I'll wait until the 72-hour burn-in test has passed. Then I'll go for it







Actually the cc_config.xml file is on my ToFix list .... Because when I added it to my configuration, all GPU's stopped running. Right after I asked them nicely to run all of them, using that .xml file







But you are right, it may be the issue.. So please don't say you don't know these systems, because you wouldn't have been able to answer the question if you didn't









Looking forward to seeing those CPU's/GPU's crunching. As you know, I'm designing a low-cost solar-power/battery system, so that the electricity bill won't be that large! I'm trying to make one for people who live in a flat too. Both solar and wind. Wait and see... I've got version 0.01 ready. I just need a few components to behave. Actually I've got a version of it powering my mobile office.. I'll show it later on. I just want to build a system which takes the edge off the very large electricity bills







I shut down 70% of my machines for 1 month and the bill shrank to about 1/3: from about US$500 to US$150. I didn't expect it to be that much









Well, OK... Let's move on


----------



## Finrond

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Tex,
> 
> _"....Sigh... you worked out every problem so far if that doesn't work... I'm sure you will get this fixed in due time. I have confidence in you!..."_
> Looking forward to see those CPU's/GPU's crunching. As you know, I'm designing a low-cost sunpower/batteri system, so that the electrical bill wouldn't be that large! I'm trying to make one for people who lives in a flat too. Both solar and wind. Wait and see... I've got version 0.01 ready. Just need a few components to behave. Actually I've got a version of it powering my mobile office.. I'll show it later on. Just want to build a system which takes the edge of the very large electrical bills


Ok I'm impressed!

my guess is the cc_config is the issue. Hope that's it and not something else!


----------



## DanHansenDK

Hi Finrond,

I'm not using the cc_config.xml file at all, because it stopped 4 working GPU's on Halifax and Wellington from working last year. But since you are both mentioning it, I'll certainly try it out. It IS part of my plan to use cc_config.xml, and always has been, but I wanted to wait until other stuff was solved. So I'll try it; it may be the issue. It's just that it worked with 4 GPU's of the same kind without the cc_config.xml file. Actually I'm not sure how to combine app_config.xml, cc_config.xml, global_prefs_override.xml and/or the BOINC standard settings AND the 4th factor, BoincView or another kind of manager/GUI controller!?!?!? But, I WILL get there. It might take another year, but we'll get there







This will be with AndroBoinc, to see jobs from the outside on your mobile even though we are "hiding" behind a firewall







We'll use some SSH hardening and some non-standard ports for controlling BOINC...









I did just put the card into a slot on a working system, without installing the system from scratch and without updating CUDA to a current version. I think this may contribute to the fact that the 2 types of cards won't run together... And that cc_config.xml.. I'll try that first, then the reinstallation.. Let's see what happens









Thanks guys,

*Status "Test System 6" aka "Beaufort" - 07.04.15 06:08pm:*



*Status "Test System 4" aka "Wellington" - 07.04.15 06:09am:*



KR
Dan

.


----------



## Finrond

The cc_config.xml file goes into the BOINC data directory (I know this location on windows machines, but not on Linux). If the only option you are setting is to use all GPUs, then it would look like this:

Code:


<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
</options>
</cc_config>
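To stage that file on one of these Ubuntu Server boxes, a minimal sketch (assumption: with the Ubuntu `boinc-client` package the data directory is `/var/lib/boinc-client`; adjust the path if your client runs from somewhere else):

```shell
# Write the cc_config.xml from the post above to a scratch location first.
cfg=$(mktemp -d)/cc_config.xml
cat > "$cfg" <<'EOF'
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
EOF
echo "wrote $cfg"
# Then, on the cruncher (as root), copy it into the data directory and reload:
#   cp cc_config.xml /var/lib/boinc-client/
#   service boinc-client restart     # or: boinccmd --read_cc_config
```

Restarting (or `--read_cc_config`) matters: the client only reads cc_config.xml at startup or on an explicit reread.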


----------



## Tex1954

I may add, you can EDIT the cc_config from within BOINCTasks...


----------



## DanHansenDK

Hi Tex & Fin,

Thanks. I'll try it. I know cc_config.xml and its location on Linux. The only reason I didn't try it was because it made a working system running 4 GPU's stop working







But, as we talked about, we'll try and see if it changes anything in this situation. First we try cc_config.xml, then I'll change the slots and make e.g. the GTX750TI the "master" and let the GT640 sit in another slot. Then I'll try changing the GT640, and if it still doesn't work, I'll reinstall the system. It's running CUDA 6.5, and there may be a bug lurking in that version









Yeeaaahhh!!! LOVE IT!!









*Status "Test System 6" aka "Beaufort" - 09.04.15 11:36am:*

_ongoing work...._


----------



## DanHansenDK

*Status "Test System 6" aka "Beaufort" - 13.04.15 03:14pm:*
Still running without problems. I'm beginning to like this GPU







Well, it's time to start modifying things and make the other test systems run properly. I'll start doing that tomorrow.

Here's the temperatures after 72 hours:

Test System 6 - Beaufort:
It is running a little hot and I'm a bit concerned about the temperature when stuffing it with all 4 GPU's... Will there be a need for watercooling? Two GT640s have died after running for a little more than 6 months. Are these just "bad cards" or is it an error in the design?



Test System 4 - Wellington:


_
ongoing work...._


----------



## Tex1954

Looking good! Those little cards should never generate much heat, even overclocked to the max... Anything under 65C is really good.

I love tinkering myself... can't wait to tinker more....


----------



## DanHansenDK

Hello Crunchers,

*Status "Test System 6" aka "Beaufort" - 06.05.15 01:34pm:*

Happy days!! Test GPU number "2" has arrived (KFA2 GeForce GTX 750 OC Slim - 2GB) and will be installed this evening. The system will be reinstalled to see if both KFA2 cards run at the same time, as the GT640s did !?! I'm looking forward to testing it









I will perform a 72-hour burn-in test right after finishing the reinstallation. This cruncher will be running Ubuntu Server 14.04/CUDA 7.0 as well.

ToDo info:

Code:


HEADLESS LINUX CLI MULTIPLE GPU BOINC SERVER - RACK-MOUNTED HEADLESS BOINC SUPER CRUNCHER
OS: Ubuntu Server 14.04.1 64Bit
KERNEL: 3.13.0-32-generic
CUDA: CUDA 7.0
NVIDIA: v.340.29 
BOINC: v.7.2.42
TODO: v.1.1.10 03.04.15 11:49:00

.


----------



## DanHansenDK

*Status "Test System 6" aka "Beaufort" - 10.05.15 07:25am:*
OK, GeForce GTX 750ti number 2 has been installed! There are a few issues we'll have to discuss here. I'll try to show the problems and issues along the way. I'll perform a complete re-installation of Ubuntu Server 14.04 & CUDA 7.0 to rule out software issues. After installation I'll perform a 48-hour test of this second card. I'll only run a "short" test because of some of the issues I'm going to show you below









Preparing & installing the second KFA2 GTX 750ti GPU:
Here's the card as it comes in the box. We'll have to modify it to be able to install it in a "Low Profile" environment, so we'll do just that







Remove the bracket highlighted in red. Please notice the 2 small screws, which need a size PH1 screwdriver.



VGA connector or not?:
Now we'll remove the VGA port highlighted in green, unless you are going to connect your monitor to this GPU using a VGA connector. Locate your Low Profile bracket and prepare to mount this after the next step.



Cords out of the way:
Let's get rid of that cord... Just do what's done in the image







(I couldn't find the English word for "strip" or "kabelbinder" — it's "cable tie", or "zip tie")





Mount the Low Profile bracket:
Now mount the Low Profile bracket from before, so that this card fits a Low Profile chassis



Problems...:
But there's a problem here... After installing the second KFA2 GTX 750ti card it's clear that there will not be room for 4 of these cards. This is because the heatsink is larger than the one on the GT640, and the card won't fit into the last slot. This means: locate a smaller 2U PSU (probably not possible), install fewer GPU's (we don't like that), or switch to watercooling to save space. Let's look at the image, shall we!? There's certainly room for some kind of cooling device where the standard PSU used to be, in the front right side that is. But water cooling??? Is it better to find a board with 3 slots with more space between them??? Let's hear what you guys think!!!



Problems...:
In this image you can see that the spacing between the GT640s and the GTX 750ti's is different. (Red triangle) There's 3 times more space between the GT640 GPU's than between the GTX 750ti's. (Green box) And there's no room for the 4th GPU !!! This is the bad part of the story!!



Problems...:
There's another issue! The bracket seems to be made by a "left-handed" robot!! The screw doesn't fit. I've never seen this before!! We can solve this by modifying the bracket or simply using another brand.



Space issues and heat?!?:
First let's test the second card and see what happens when using GPU's mounted this close together. As you can see in the image there's not a lot of room here... I think it'll show...



Temperature watch - GTX 750ti d1 and d2...:
Result after 5 minutes of running at 100%


Temperature watch - GTX 750ti d1 and d2...:
Result after 10 minutes of running at 100%



Temperature watch - GTX 750ti d1 and d2...:
Result after 15 minutes of running at 100%



Temperature watch - GTX 750ti d1 and d2...:
Result after 30 minutes of running at 100%



Temperature watch - GTX 750ti d1 and d2...:
After 30 minutes we can see that the difference is less than 3-4 degrees. But still, I don't like it! I hate the idea of one card "suffering" more than the other. It's just not right







Let's see what happens when some more time has passed. I think I'll order the 3rd card so that we can test this. This is, BTW, the reason why I only test one card at a time. You never know what's going to happen when using new hardware. That, and the fact that I'm a poor student









_Ongoing work....._

.


----------



## Tex1954

Wow, looking gooder! Temps look fine and it's normal for multi-card installations to have different temps unless they are water cooled. The card sucking hot air from the back of another always suffers if they are side by side. I would not worry about it at all.

Brackets are another issue... some fit better than others, just sloppy designs some of them. I've had to loosen all the screws in a motherboard to align things so they sort of fit better a few times myself.

Keep up the good work!


----------



## DanHansenDK

Hi Tex,

Thanks my friend









_"....Brackets are another issue... some fit better than others, just sloppy designs some of them....."_
Yes, but in this case they are "inverted". Look at the "cut" for the screws.. It's "right-handed" instead of the normal "left-handed"*
* Couldn't find a better word









Tex, what would you suggest regarding the space issue!?!? There's only room for 3 GTX750ti's !?!? If we were to use watercooling, I think it would help and make the last, 4th GPU fit !?!? Or is this going to be a compromise, with only 3 GPU's used ??? I'm a little lost here. If any of you could help me find the equipment for watercooling this model of graphics card, then maybe we should try it!?!? As shown, there's room for a cooler/radiator if we can get a model which is not so large, and which can be used for 4 GPU's. This is your area of knowhow, not mine. I hope you've got some ideas.









As you can see in this image, there's plenty of room where the original PSU was placed, or was supposed to be placed







And this is the best-cooled area, the place with the most airflow !!! If we can find a watercooling unit which can fit here and cool 4 GPU's, then I think this will be a great Cruncher Model 2. Please keep in mind that this mobo is "waterproof", coated with some kind of film so that water wouldn't damage it. I know there are plenty of things that can go wrong when and if water flows in a system like this, but nevertheless it sounds good







A water-protected mobo and watercooling. I liiike











Or, if we decide to drop the watercooling and go for the 3 GPU's which there is room for, then maybe this model could come in handy. Please notice the black slot for a graphics card in the middle. This would give more room between the 3 GTX750ti cards. That is, if this black socket works like the others









*ASRock X99 OC Formula*

Chipset

Intel X99
Memory: Quad Channel DDR4, 8 x DIMM, Max. 128 GB, 3400+(OC) / 2933(OC) / 2800(OC) / 2400(OC) / 2133 / 1866 Non-ECC, Un-buffered Memory
Multi-GPU

Supports NVIDIA 4-way SLI Technology
Supports AMD 4-way CrossFireX Technology

Slots

40 Lane CPU
5 x PCIe 3.0 x16 (x16, dual x16, x16/x16/x8, x8/x8/x16/x8, x8/x8/x8/x8)
1 x PCIe 2.0 (28 lane CPU)
28 Lane CPU
3 x PCI Express 3.0/2.0 x16 slots (x16, x16/x8, x8/x8/x8)
2 x PCI Express 2.0 x16 slot (max at x1 mode)
1 x PCI Express 2.0 slot (compatible with x1 and devices)
1 x M.2 slot Gen3 x4
1 x M.2 Gen2 x4



Please let me know what you think!!
What's the right thing to do!? I would like to do the watercooling as a project, because the air version is less effective. It's not that much better than Cruncher Model 1 (GT640), right?? I'm looking forward to hearing from you









I'll buy the third GTX750ti this Friday and then we'll see how it fits. I know there's no room for a fourth, so we need to make a decision. I would very much like to hear your views on the matter









Here's a screenshot showing "Beaufort" with 2x GTX750ti. About 10 degrees difference at this point. It has been running for several days now, so I think this is how it will do.



.


----------



## Tex1954

Well, seems EK made a 750tI Waterblock... don't know if it would fit your cards, would have to check with them.

Doing a watercool setup for ONLY the GPU's would probably be simple. You would only need one radiator and that could be placed outside the 2U box.

But, remember, when you watercool a card, you have to take it apart and that can void any warranty.... It's easy to screw something up and brick the card....

Seems to me, the best setup would be a dual or triple SLIM radiator, maybe one with a built in pump. Don't really need a reservoir except that water seems to go away somewhere somehow over long periods of time... You could use 3/8" hose too.... that cuts down on space.

If you look around, you can sometimes find deals for watercooling parts too... just keep in mind that some chemicals mess up pump bearings and lots of rads have damage from unskilled handling. You want to avoid those..

Still, it would cost you an additional $350-$450 to watercool 4 cards... If that isn't a problem, go for it!



EDIT:

PerformancePCS has *two* 750 Ti waterblocks... not sure if they would fit your cards though, you would have to check. This is an idea of how much it would cost for the basic parts... keep in mind, you need MORE waterblocks so add that to the total..



Also, you are right, I was in a hurry with slow connection... those brackets are completely wrong, like made backwards! LOL! Call the manufacturer or something and complain... maybe they can help...










PS: Running shuttle between Detroit and Canada right now... actually in Canada now. Deliver tomorrow then back and forth some more...... Back and forth... Up and Down... LOL


----------



## DanHansenDK

Hellooo Tex









_Doing a watercool setup for ONLY the GPU's would probably be simple. You would only need one radiator and that could be placed outside the 2U box._
OK, what about placing it in the front right corner of the chassis?? I marked the place with 2 green arrows







It would be a great place regarding the airflow. This is the primary intake, and the flow is very good!
As I've highlighted with the red box/arrow, there's very little room for air. These heatsinks are larger than those on the GT640. Much larger. There's almost no room for airflow.



And there's just no room for the 4th GPU. It's such a shame. We'll never know how it would have gone. Well, this was the reason for the decision to test watercooling, so maybe it's not all that bad







We did test 4-5 different mobos and 2 models of the OC Formula. This one is the only one that handles 4 GPU's continuously without crashing or errors under the Linux OS/CUDA/Nvidia driver. Anyway, even if we found a mobo with another design, the spacing of the GPU's would still be the same. Only if we decreased the number of GPU's and went for a special 3-slot mobo would the space between those GPU's increase. I showed an image of one of those mobos earlier on, I think. It's not a solution, it's a workaround!! Not for us, right











_But, remember, when you watercool a card, you have to take it apart and that can void any warranty.... It's easy to screw something up and brick the card...._
Well, I've broken the warranty too many times already during this project







The chassis has been cut, home-made wire connections, 100% load on the GPU at all times... This is how we find the right parts.. The right parts "stay alive"







But thanks... You are quite right... And I found the site, http://www.performance-pcs.com/water-blocks-gpu . Thanks for teaching me which parts I need, because this is all new to me









_Still, it would cost you an additional $350-$450 to watercool 4 cards... If that isn't a problem, go for it!_
Well, I'm attending university from August, so I'll be on a budget. Anyway, this doesn't scare me. Not at all. Actually I expected more than that. Well, "expected" may not be the right word here.... "Feared" seems to be more the word I was looking for









_PerformancePCS has two 750 Ti waterblocks... not sure if they would fit your cards though, you would have to check. This is an idea of how much it would cost for the basic parts... keep in mind, you need MORE waterblocks so add that to the total.._
Thanks a lot... I'll look into it. We may be ready to test this around 01.09.2015, I hope.

Well, the third card has been delivered and I'll install it and reinstall the OS. Actually I was wondering how it was able to run with these new cards on the old driver and CUDA







But it did. Anyway, I'm excited to see how it does with 3 of these cards. Well, Headless Linux CLI Multiple GPU Boinc Server version 1 is done and we are on the way with version 2. I will of course make a complete ToDo for version 1 and place it on the first page of this project







Just like with the "RAID FileServer & BackUp" project









OK, here we go... There's no need to take pictures and show how we'll install the card. I did that last time. I'll be showing the test results, the burn-in test and, of course, again when we are going to install the watercooler









*Status "Test System 6" aka "Beaufort" - 11.06.15 05:34am:*

Temperature watch - GTX 750ti d0, d1 and d2 - 5min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 10min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 15min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 30min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 120min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 240min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 6hours.:



Temperature watch - GTX 750ti d0, d1 and d2 - 12hours.:



Temperature watch - GTX 750ti d0, d1 and d2 - 24hours.:
Adding an 18-hour check because of the increased temperatures. But it seems to have been a couple of special jobs/workloads. The temperature is back down



Temperature watch - GTX 750ti d0, d1 and d2 - 24hours.:



Temperature watch - GTX 750ti d0, d1 and d2 - 48hours.:
_not there yet...._

Code:


# uptime
 11:39:54 up  6:05,  1 user,  load average: 6.10, 5.95, 5.78

Code:


# uptime
 17:53:54 up 12:19,  1 user,  load average: 4.52, 4.51, 4.53

Code:


# uptime
 23:25:49 up 17:51,  1 user,  load average: 5.12, 5.01, 5.01

Code:


# uptime
 05:19:32 up 23:45,  1 user,  load average: 6.01, 5.89, 5.75

_Ongoing test........._

It seems like one card still gets a little hot (by my standards). I don't like 50+ C !!! We still have 10% - 20% more to give on the chassis fans. We can increase the chassis fans by about 700 rpm, so this might help. I'll make an image to show you how very little room there is between these cards! I don't like it, which is why Mr. Tex and I are talking about watercooling. Those watercooling blocks are very small.

BTW, the command to get continuous temperature readings in the CLI with nvidia-smi is:

Code:


Command: # nvidia-smi -a -l | grep Temp
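A tidier alternative to grepping the full `-a` dump is to ask nvidia-smi for just the fields you want. A hedged sketch (guarded so it also runs on a box without the driver; add `-l 5` to repeat the reading every 5 seconds, as in the command above):

```shell
# One-shot temperature readout per GPU via nvidia-smi's query interface.
if command -v nvidia-smi >/dev/null 2>&1; then
  temps=$(nvidia-smi --query-gpu=index,name,temperature.gpu --format=csv)
else
  temps="nvidia-smi not available on this machine"
fi
echo "$temps"
```

The CSV output is also easy to log to a file for the 48-hour burn-in, e.g. appending a timestamped line from a cron job.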

WaterCooling for the GTX750ti - Will it fit? Or will I get a fit? I'll write to them and check if this block will match the GTX750ti, and whether it's possible to mount it horizontally instead of vertically!! I spotted a problem! There's no room for a hose on top of the GPU, that's for sure. But the image of the block seems to show that it's possible to mount it other than vertically!?!? And whether there's other stuff we need when connecting 4x GPU's side by side on this mobo.

How high is this block with the 4x SLI connector on top? There's no information about size under "Specifications". Really bad! Too bad, actually. If size doesn't matter here, where would it!? Info:
WaterBlock
- Alphacool NexXxoS GPX - Nvidia Geforce GTX 750 Ti M01 - with Backplate - Black
- Model: AC-11146
SLI-Connector
- Alphacool GPX SLI Connector - Quad - Plexi
- Model: AC-12438


Here are the items we are going to order from over there, if it fits of course. There are more issues still to solve. (If the shop doesn't ship, I'll try to find the same things in the UK or maybe in a national shop) Info:
WaterBlock
- EK-FC750 GTX (for GeForce GTX 750 (Ti) ) - Acetal
- Model: EK-FC750GTX-AC
http://www.performance-pcs.com/ek-fc750-gtx-for-geforce-gtx-750-ti-acetal.html



Quad SLI Connector
- EK-FC Terminal QUAD Semi-Parallel - Acetal
- Model: EK-FC-TERMINAL-QUAD-SEMI-PAR
http://www.performance-pcs.com/ek-fc-terminal-quad-semi-parallel-acetal.html


Radiator
- Swiftech MCR220-DRIVE REV3 Dual 120mm Radiator - NO PUMP
- Model: MCR220-DRIVE-B-R3
http://www.performance-pcs.com/radiators/hot-swiftech-mcr220-drive-rev3-dual-120mm-radiator-no-pump.html#Specifications


Please notice!! This is sadly too large (284 x 128 x 34 mm), but only by about 24 mm. We've got room for 260 x 200 x 84 mm. The SSD disk can be placed on top of the DVD. If we need to, we can drop the SSD and the DVD completely and use the space, then install an M.2 disk directly on the mobo and use a USB key for installation. Works fine!! But I think we may need to try and find a radiator which fits, to keep the costs down
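A quick arithmetic check of the fit, using the dimensions quoted above (radiator 284 x 128 x 34 mm versus the 260 x 200 x 84 mm of free space; all numbers are from the post, nothing measured independently):

```shell
# How far does the Swiftech MCR220 overshoot the longest free dimension?
overshoot=$(awk 'BEGIN { print 284 - 260 }')
echo "radiator is ${overshoot} mm too long for the 260 mm bay"
# width 128 < 200 mm and height 34 < 84 mm, so length is the only blocker
```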







Well, I haven't done the math yet; it may actually be cheaper using an onboard disk and a USB key







But let's keep it simple, and change as little from version 1 as possible.

WaterPump

*Our email/question to the shop:*

_Dear Sirs,

We need a little help regarding waterblocks for our GTX750ti (low profile/slim card from KFA2).

Which waterblocks fit this card? And, if it's the one mentioned below, can it be mounted horizontally instead of vertically?? The reason I ask is that this is for a project with 4x SLI installed in a 2U chassis (84 mm high). Here's an image showing the cards. There will be a 4th when and if you've got a solution to our problem.
http://www.overclock.net/content/type/61/id/2485278/width/500/height/1000/flags/LL

We have seen your SLI Quad Connectors, but using one of those, the installation will be too high, over 84 mm, right? Have you got another way to use such a connector?

Do you ship overseas? Or do we have to get a local to ship it for us?

We are looking forward to hearing from you.

Kind Regards,
Dan Hansen
Denmark
_

*Testing GTX750ti from KFA2*
A unit done from Einstein shows this:

Code:


[07:25:55][14230][INFO ] Using CUDA device #0 "GeForce GTX 750" (512 CUDA cores / 1177.60 GFLOPS)
Report deadline 15 Jun 2015 20:54:27 UTC
Run time        15,970.85
CPU time        496.75
Validate state  Valid
Claimed credit  8.60
Granted credit  4,400.00
application version     Binary Radio Pulsar Search (Parkes PMPS XT) v1.52 (BRP6-cuda32-nv270)

According to my math, it took about 4.4 hours for one GPU to do this job. That means more than 1,000 credits per hour per GPU!! Let's just say 4,000 credits an hour when running 4 GTX750ti GPU's, times 24 = 96,000 credits per day. And then of course the little bit the CPU does as well. Will this system be able to do 100,000 credits per day for Einstein/SETI*/Asteroids*?? If so, my goal has been reached! All we need then is to fix the watercooling issue and set up the shell scripts, and the cruncher will be done. I'm a little excited to see how it'll do over the next few days. When done testing, we'll compare to the 2 "version 1" systems using GT640 and see how it goes.
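The estimate above can be reproduced directly from the Einstein@Home task shown (15,970.85 s of run time for 4,400 granted credits on one GTX 750):

```shell
# Credits per hour for one GPU, and the projected daily total for 4 of them.
per_gpu_hr=$(awk 'BEGIN { printf "%.0f", 4400 / (15970.85 / 3600) }')
daily_4gpu=$(awk 'BEGIN { printf "%.0f", 4 * 24 * 4400 / (15970.85 / 3600) }')
echo "credits/hour per GPU: $per_gpu_hr"   # ~992, i.e. just under 1,000
echo "credits/day, 4 GPUs:  $daily_4gpu"   # ~95,213, close to the 96,000 estimate
```

So the round "1,000/hour, 96,000/day" figures in the post are slightly optimistic but in the right ballpark.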


----------



## Finrond

Honestly, I'm surprised GPU temps are even that low considering how close they are in the chassis. If it weren't for the heatsinks blocking that 4th slot I would say job well done, leave it alone. For a temp comparison, I've been running my 670 at 75-80c for the last two years almost non-stop.


----------



## Tex1954

Well, lots of questions to answer... You didn't give dimensions for the "radiator" area, so no idea what will fit there...

Also, I still wonder why you wish to water cool... 50C is perfectly okay for any GPU temp... MY GTX560 Ti cards run 63c water cooled full load. Don't expect miracles with water cooling. The major advantages are quieter operation and more real cooling power for higher powered cards. Your GTX750 cards may run the same temps water cooled, hard to tell, but usually water cooling will let the GPU's run the same temp as if the original fan was at 100% with best air flow. Obviously, multi-GPU setups gain advantage with water cooling due to air flow restrictions with multiple cards.

In any case, those distribution blocks vary according to the design, from around 5/8" to maybe 3/4" above the card edge..... I like a different method myself... I like the pipe-connectors...


----------



## magic8192

I am going to run low profile 750ti cards in my new 2u server. I really like your setup!


----------



## DanHansenDK

Hi Finrond









Right.. But it can be a different kind of temp reading you've got. Is it a reading from the mobo, or from the GPU/card itself? Is it an analog reading of a sensor mounted on the GPU? These temp readings vary a lot. I had an ASUS GT640 die after running at 100% for a month at around 60 degrees Celsius. And that card was supposed to endure 95 degrees.









If it wasn't for the 4th card I would keep it air cooled. But as you say, there's no room for it, and it's impossible to find another 500-watt 2U PSU. It's already small







I'm a little nervous about the thought of bringing water into my electronics, so it would have been nice to keep it air cooled







Let's see what happens, maybe there's not room for this cooling system!?

The temp seems to keep rising past 12 hours!? It could be the jobs that raise the temperatures. 63 degrees on card d0 and 57 on card d2, which is strange, because the one in the middle is the one that feels the hottest. It could be that the device numbering got jumbled during the OS installation, but I think not. Only the card in the middle should be hotter than the other two, because it has a card on either side.

Hi Tex









I measured the space which we could free up in 2 minutes:

_Please notice!! This is sadly too large (284 x 128 x 34 mm), but only by about 24 mm. We've got room for *260 x 200 x 84 mm*. The SSD disk can be placed on top of the DVD. If we need to, we can drop the SSD and the DVD completely and use the space, then install an M.2 disk directly on the mobo and use a USB key for installation. Works fine!!_

_Also, I still wonder why you wish to water cool... 50C is perfectly okay for any GPU temp... MY GTX560 Ti cards run 63c water cooled full load. Don't expect miracles with water cooling._
Well, as you can see in the image below, there's no room for this 4th card, and a waterblock would make it possible for me to install the 4th GPU







That's why. Sorry, I haven't let you know. Sometimes I think you guys are mind readers apparently











_In any case, those distribution blocks vary according the design from around 5/8" to maybe 3/4" above the card edge.....I like a different method myself... I like the pipe-connectors..._
Nice picture, man!! Looks great.. This is what I need; I need to see the ways it can be done to find a solution. I wrote to the shop you told me about and asked about these things....
_
working on this ........_

Hi Magic









_I am going to run low profile 750ti cards in my new 2u server. I really like your setup!_
Well, that just warms a Viking the f.... up, all the way up here in the cold and freezy north








Is this system for crunching numbers or is it a game-system?? I'm guessing crunching









Remember to check the RPM of the fans. Well, if you only use 1 GPU, then I guess the standard fans will be enough. Which 2U case are you using?

48hours burn-in test in progress --> here's the new results: http://www.overclock.net/t/1467918/project-headless-linux-cli-multiple-gpu-boinc-server-ubuntu-server-12-04-4-14-04-1-64bit-using-gpus-from-geforce-gt610-640-to-crunch-data/220#post_24022477

.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic
> 
> 
> 
> 
> 
> 
> 
> 
> 
> _I am going to run low profile 750ti cards in my new 2u server. I really like your setup!_
> Well, tat just warms a Viking the f.... up all the way up here in the cold and freezy north
> 
> 
> 
> 
> 
> 
> 
> 
> Is this system for crunching numbers or is it a game-system?? I'm guessing crunching
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Remember to check RPM of the fans. Well, if you've only use 1 GPU, then I guess it'll be enough with the standard fans. Which 2U case are you using?


Got a Coraid 2u server from Ebay. Here are some details and pics. Looks like I can easily add 3 750ti cards, but I might be able to squeeze in 4. It will be a 24/7 cruncher.


----------



## DanHansenDK

Hi Magic,

This is a real server! This is a server which can easily be your best system. Here the airflow is engineered, unlike my stuff. That one looks really great. And it has got a 2U PSU. Is it possible for you to "box" the fan areas (mark where the fans are)?

If you are going to use the same card as me you might just be able to fit 4, because of this real server PSU!! This is great stuff!! And if you are going to stick with the air cooling, then I say use the special 5-slot mobo which I showed earlier on. I will find it again and show it to you. Then there will be more space between the 3 GPU's. It's just that 4th GPU, or d3 in my case; there has to be room for it, or else this 5-slot mobo is of no use. Can you measure your space for me, and I'll make some calculations?? From the one red bar to the other red bar.

I will check if there's room for the 4th if you are going with the 4-slot mobo, or the 3rd if you are going to use the special 5-slot mobo for more room between the GPU's









Well, here's where I guess the main fans are placed. Where else are they located, if anywhere?


----------



## magic8192

There are 4 fans in the front of the case.
There are 7 full-size PCI Express 2.0 slots (x16 physical, x8 electrical) on this board.


----------



## magic8192

The measurement from red bar to red bar is 1.25 in or 3.175 cm


----------



## DanHansenDK

Hi Magic,

Regarding those 7 PCIe slots, the spacing between them is different. And you are going to install a new mobo of course, but you know that already









_The measurement from red bar to red bar is 1.25 in or 3.175 cm_
The card measures 3.75-3.80 cm from the SLI slot to the widest point of the heatsink. It's a rather large heatsink. According to my calculation, you will not be able to fit 4 of those, nor 3 using the 5x SLI special board. But you will be able to fit 3 of them like I've done it right now.
Oops... Not so fast!! That was the measurement from a PCI socket to the edge of the chassis!! I'll have to check the design, in case the SLI socket is placed differently. And I think it might be!! I'll just have to compare 2 standard ATX mobos like this OC Formula model. Some of those mobos with 4x SLI are larger than standard ATX. I'll have to get back to you on this matter







There's hope yet







Anyway, if it doesn't fit, you can just install 1x GT640, which does about half the work of a GTX750ti









BTW, installing a new mobo will give you free space in the left front area!! The mobo in that server is the same size as Long Island, so there's space to be saved here








MagicSuperCruncher to be...


But, as you can see, I've had to open the hood of "Beaufort" to lower the temperature!! It got really hot, much hotter than I'm going to accept. This card is fantastic, but the heatsink is badly designed. It should have been longer and not so fat









*Status Test System 6 aka "Beaufort"*

Code:


# uptime
 09:33:27 up 1 day,  3:59,  1 user,  load average: 4.59, 4.85, 5.09

Temperature watch - GTX 750ti d0, d1 and d2 - 28hours.:
Temperatures are rising here. I had to "pop the hood" and give it air. I'll try giving the fans full throttle to see what happens


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic,
> 
> Regarding those 7 PCIe slots, the spacing between them is different. And you are going to install a new mobo of course, but you know that already


I plan on using this motherboard if everything is working.
Quote:


> _The measurement from red bar to red bar is 1.25 in or 3.175 cm_
> The card measures 3.75-3.80 cm from the SLI slot to the widest point of the heatsink. It's a rather large heatsink. According to my calculation, you will not be able to fit 4 of those, nor 3 using the 5x SLI special board. But you will be able to fit 3 of them like I've done it right now.
> Oops... Not so fast!! That was the measurement from a PCI socket to the edge of the chassis!! I'll have to check the design, in case the SLI socket is placed differently. And I think it might be!! I'll just have to compare 2 standard ATX mobos like this OC Formula model. Some of those mobos with 4x SLI are larger than standard ATX. I'll have to get back to you on this matter
> 
> 
> 
> 
> 
> 
> 
> There's hope yet
> 
> 
> 
> 
> 
> 
> 
> Anyway, if it doesn't fit, you can just install 1x GT640, which does about half the work of a GTX750ti


I think only 3 will fit in the case.








Quote:


> Spoiler: temp testing etc.
> 
> 
> 
> BTW, installing a new mobo will give you free space in the left front area!! The mobo in that server is the same size as Long Island, so there's space to be saved here
> 
> 
> 
> 
> 
> 
> 
> 
> MagicSuperCruncher to be...
> 
> 
> But, as you can see, I've had to open the hood of "Beaufort" to lower the temperature!! It got really hot, much hotter than I'm going to accept. This card is fantastic, but the heatsink is wrongly designed. It should have been longer and not so fat
> 
> 
> 
> 
> 
> 
> 
> 
> 
> *Status Test System 6 aka "Beaufort"*
> 
> Code:
> 
> 
> # uptime
> 09:33:27 up 1 day,  3:59,  1 user,  load average: 4.59, 4.85, 5.09
> 
> Temperature watch - GTX 750ti d0, d1 and d2 - 28hours.:
> Temperatures are rising here. I had to "pop the hood" and give it air. I'll try giving the fans full throttle to see what happens


----------



## DanHansenDK

Hi Magic,

_I plan on using this motherboard if everything is working._
But those are only PCI sockets. Do you plan to convert the PCI slots?? Actually there's a little company making converters not far from me, but I'll guess that it is cheaper to buy a new mobo. Much cheaper. This Z97 OC Formula is not that expensive








BTW, the slot is placed in the same place as the SLI socket, because it has to line up with the back of the chassis. The 7 mounting brackets in the back







I was very tired when I wrote you last time









*Status Test System 6 aka "Beaufort"*
I'm going to change the configuration of the fans so that they run at 100% all the time. Only if the CPU gets colder than, say, 40 degrees Celsius can they throttle down. We need the air here







After that I'll reinstall the whole shebang - testing the new CUDA version at the same time. We need to be sure it works.

The 48-hour burn-in test is over and there were a few hiccups!? Card d0 got too d... hot, that's for sure. We'll test again after increasing the fans by about 700 RPM.

Code:


# uptime
 04:10:01 up 1 day, 22:35,  1 user,  load average: 5.11, 5.08, 4.92



*Status Test System 6 aka "Beaufort"*
Fan settings have been altered in BIOS and look like this now:

GPU/Chassis Fan BIOS Config - CPU Sensor Control! (Not Mobo Sensor Control)

Code:


30° --> 80%
40° --> 100%
60° --> 100%
70° --> 100%
80° --> Critical

This means that if the CPU temperature is below 40 (39 or less), then the chassis fans (which point directly at the GPUs) will throttle down to 80%. Even though we are using an industrial 2U fan on top of the CPU, the temperature is still higher than 40 when it runs at 100%. I've been measuring this for a while now, and an i5-4690K running at 100% using this industrial 2U CPU cooler lies in the area of 43-56 degrees, if I remember correctly. This way the system throttles down in case of no workloads/jobs. No need for the system to race ahead if there's no need for it









Reinstalling OS
BTW, please let me know if any of you are going to "switch sides"







If you are going to make a Linux installation, then please let me know, and I'll post the newest ToDo of mine, so that you don't waste your time trying to solve issues I've already solved. The only reason I haven't posted it here yet (in a complete version) is the unfinished shell scripts!!

_ongoing test......_



----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic,
> 
> _I plan on using this motherboard if everything is working._
> But those are only PCI sockets. Do you plan to convert the PCI slots?? Actually there's a little company making converters not far from me, but I'll guess that it is cheaper to buy a new mobo. Much cheaper. This Z97 OC Formula is not that expensive
> 
> 
> 
> 
> 
> 
> 
> 
> BTW, the slot is placed in the same place as the SLI socket, because it has to line up with the back of the chassis. The 7 mounting brackets in the back
> 
> 
> 
> 
> 
> 
> 
> I was very tired when I wrote you last time


They are PCI express X16 slots. The 750ti will drop right in.


----------



## DanHansenDK

Hi Magic,

Sorry man!! It looked like standard PCI sockets and not PCI-e







That's just great!! No problems at all









I'm reinstalling the OS now. It's a little late... sorry, early







But I have to finish this installation so that we can test it with the new BIOS setup!! I've changed the RPM to 100% on the PAPST Fans - so I'm very excited to see what happens. But, it might not help much.. Let's see what happens







Looking forward to seeing the "MagicSuperCruncher version 1"

Hi Tex,
_In any case, those distribution blocks vary according to the design from around 5/8" to maybe 3/4" above the card edge.....I like a different method myself... I like the pipe-connectors..._
Those pipe-connectors might make it possible for me. I thought you had to use an SLI connector bridge to assemble more than one card. If I can rotate the waterblock and make the hose connectors stick out at the back of the card, it just might work







If there's any kind of info you can give me, please don't hold back


----------



## DanHansenDK

NEWS!! Ubuntu Server 14.04.2 is out! This means that we are going to test both CUDA7 and Server 14.04.2 running these GPU's (GTX750ti).

*Status Test System 6 aka "Beaufort" - 13.06.15 08:11am*
I have tested both the "old" distro Ubuntu Server 14.04.1 and 14.04.2. Both have a problem after installation regarding the screen driver (Unity). Using an HDMI connector now, and not VGA like before, an issue has occurred: after installation of both versions the screen halts! But don't panic guys, it just doesn't show the login screen. I've tried other TTYs, but without luck. Usually I SSH in remotely using PuTTY, and to do that I need an IP. Since it's a clean installation we are using DHCP instead of a fixed IP, so we need to find the IP. Unless you know another way, you can find the IP by checking your router - most routers show client names, IPs and MAC addresses. I solved the problem this way and was able to SSH in using PuTTY and that IP.
I'm guessing there will be a package update in the near future, and since this card is pretty new the issue is not such a surprise to me
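Since a clean install defaults to DHCP, another way to recover the IP without router access is to match the server's MAC address in the ARP table of a neighbouring machine. A minimal sketch - the helper name, MAC address and subnet below are placeholders, not from the ToDo:

```shell
# find_ip_by_mac: print the IP that `arp -n` associates with a given MAC.
# In `arp -n` output the columns are: Address HWtype HWaddress Flags Iface,
# so the MAC is field 3.
find_ip_by_mac() {
    awk -v mac="$1" 'tolower($3) == tolower(mac) { print $1 }'
}

# Typical use from another box on the same subnet (placeholder values):
#   sudo nmap -sn 192.168.1.0/24 >/dev/null   # populate the ARP cache
#   arp -n | find_ip_by_mac 00:11:22:33:44:55
```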











Step 1. OK, following my ToDo, making the basic setup and preparing the server for headless rendering...
Step 2. Installing CUDA 7.0 - it looks good so far...



*Status Test System 6 aka "Beaufort" - 13.06.15 10:28am*
OK, Ubuntu Server 14.04.2 & CUDA7.0 works just fine!! OS reinstalled and I've followed the ToDo to the point!! Everything works beautifully








But there's still the problem of a hot (in my opinion) GPU (d0), and again it's an Asteroids job that makes it so. Or is it?? 3 times I have noticed hot GPUs, and every time it was an Asteroids job. Let's see if it can hold. According to guys much brighter than myself, it should be OK. So let's just see what happens. I'll issue a re-test here.



*Status "Test System 6" aka "Beaufort" - 11.06.15 10:37am:*
48-hour burn-in test with new BIOS setup and new OS installation. Chassis fans x4, RPM +700

Temperature watch - GTX 750ti d0, d1 and d2 - 5min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 10min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 15min.:
Oops.... I hadn't unplugged the HDMI connector, which means that GPU has a larger load than the others. It's not much, but it is something!! Unplugging, and then let's see what happens...

Temperature watch - GTX 750ti d0, d1 and d2 - 5min.:
Without the load (HDMI unplugged), it seems to be the same...



Temperature watch - GTX 750ti d0, d1 and d2 - 10min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 15min.:



Temperature watch - GTX 750ti d0, d1 and d2 - 48hours:
Conclusion! Card1 (d0) heats up because of the lack of space between the cards. The secondary cooling (the 4 Papst industrial chassis fans) isn't able to push enough air in between the fins of the heatsink. This is a problem, but not a big problem. The main reason for my decision to use watercooling is that there's no room for a 4th GTX750ti in slot d3!!! I hope we'll be able to solve this issue









Code:


# uptime
 10:03:57 up 1 day, 23:18,  1 user,  load average: 4.64, 4.53, 4.54



Please notice!! While waiting for the next few dollars to drop from the sky, I'll finish those shell scripts. That way we can log the results instead of sitting here doing it all manually. And the system will just shut down if the GPU or the CPU gets hotter than we like. A combination of a shell script and crontab will solve this beautifully. The script works, but I have to make it run on multiple GPUs and multiple CPU cores... The software packages & SMTP client are ready; I posted that in here a long time ago. The reason why I haven't finished those scripts yet is that I've been building and setting up a developer system for Apache, PHP, SQL, shell scripts etc., and that system isn't done yet. But anyway, let's see if we can get them done











----------



## Tex1954

Dude, I still see nothing wrong with your temps.... Sure, we all like things to run cooler, but these setups are MADE to run in the 70°C-80°C range just fine.

Here are two GTX 580 cards cranked up to 823Mhz running with fans at 90%....



They run pretty warm.... and can do that for years with no problems... The fans are not good enough to run them at much higher clock speeds; that is where water cooling comes in. I expect these boards to do near 950MHz on water eventually...


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> BTW, please let me know if any of you are going to "switch sides"
> 
> 
> 
> 
> 
> 
> 
> If you are going to make a Linux installation, then please let me know, and I'll post the newest ToDo of mine, so that you don't waste your time trying to solve issues I've already solved. The only reason I haven't posted it here yet (in a complete version) is the unfinished shell scripts!!


I am running Linux for my current project and would appreciate using the work you have completed. Thanks


----------



## DanHansenDK

Hello friends









Mr. Tex

_"...Here are two GTX 580 cards cranked up to 823Mhz running with fans at 90%...."_

Well, have you thought about a secondary source of air supply, like my chassis fans pointing directly at the GPUs??? I've been fiddling around with the GPU fans as well, and have had nothing but trouble! That's why I concentrated on the chassis fans and changed those. Which case are you using? Couldn't you add a couple of larger chassis fans - or have you done that already? I'm guessing you have









Regarding my GTX750ti x4 GPU problem! It's not so much the temperatures as the problem with the 4th card. I can't install that 4th card unless I make the heatsink smaller. That's why I thought of watercooling. It would also bring the temperature on card1 (d0) into the same area as the others, so that's another reason. It's just not the primary reason for the change; that's the lack of space









Do your 2 GPUs really run that hot? They have been running up to 96 and 98 degrees. Celsius, right!?

If I had room for the 4th card, I would have kept the air cooling, for sure! But this is the only solution. And I'm actually not even sure about the watercooling, because I haven't heard from the webshop we talked about. I only asked if it was possible to turn the block 90 degrees and/or if they had a block that fits the GTX750ti. Actually, I'm pretty sure it's impossible to install watercooling in a 2U chassis. Let's see what they say. I will also write a Danish firm that fiddles with cooling - it's the place I've been getting my fans from. Maybe they have a solution









Thanks Tex!!









Hi Magic,

_"...I am running linux for my current project and would appreciate using the work you have completed.."_

Allrighty then.... Now we are getting somewhere.... I'm fiddling with the watchdog shell script, to run on both the lm-sensors CPU routine and nvidia-smi







Getting there...
I can show the complete and working part for a headless server running CUDA 7.0. The part where I set up the server for a static IP and install lm-sensors, MSMTP for mail warnings etc. can wait till it's finished, right???
I don't know which distro you are using, but this will work on Ubuntu Server 14.04.1 and 14.04.2 - and it will run on Debian too.

Here's my hard work!! It doesn't look like much, but believe me......









Code:


HEADLESS LINUX CLI MULTIPLE GPU BOINC SERVER - RACK-MOUNTED HEADLESS BOINC SUPER CRUNCHER
OS: Ubuntu Server 14.04.2 64Bit
KERNEL: 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
CUDA: CUDA7.0
NVIDIA: v.346.46
BOINC: v.7.2.42
TODO: v.1.1.12 15.06.15 10:42:00
AUTHOR: DanHansen[at]Denmark

[...]

Install Ubuntu Server 14.04.x

To get GPU and headless rendering we need the drivers from the CUDA package. Download the 64-bit Debian package for Ubuntu 14.04 from nvidia.com this way: install the Linux headers and basic build packages, and add the repo to the package manager. Please notice that CUDA may have issued a new version; if so, just change the filename accordingly. Follow these steps:
Command: # apt-get install build-essential linux-headers-`uname -r`
Command: # wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.0-28_amd64.deb
Command: # dpkg -i cuda-repo-ubuntu1404_7.0-28_amd64.deb
Command: # apt-get update
Command: # apt-get install cuda-7-0

As part of the CUDA environment, we need to add the following in the ".bashrc" file of your home folder: ***
Command: # export CUDA_HOME=/usr/local/cuda-7.0
Command: # export LD_LIBRARY_PATH=${CUDA_HOME}/lib64
Command: # PATH=${CUDA_HOME}/bin:${PATH}
Command: # export PATH

To be able to do the next steps, we need to restart the system:
Command: # reboot

Install "linux-image-extra" and x11-xserver-utils. Needed by the nvidia module and we need the X tools later on:
Command: # apt-get install linux-image-extra-$(uname -r) x11-xserver-utils mesa-utils

Once installed we need to load the nvidia module. Execute the following command:
Command: # modprobe nvidia

Enable headless mode this way: 
Info: A warning is issued, but it makes the file anyway! /etc/X11/xorg.conf is created and we need this file!
Command: # nvidia-xconfig --enable-all-gpus

The default configuration will create XF86Config, but we need xorg.conf so if xorg.conf is not present, then do this: 
Info: Only needed if you are running Ubuntu Server 12.04.x - Not if you are running Ubuntu Server 14.04.x
Command: # cp /etc/X11/XF86Config /etc/X11/xorg.conf

Install boinc-client etc.

[...]

Now yank out the cables and start crunching
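Before yanking the cables, a couple of quick sanity checks can confirm the headless setup took. A sketch - `count_gpus` is a hypothetical helper of ours, and the expected count is whatever your box holds:

```shell
# Is the nvidia kernel module actually loaded?
lsmod | grep -q '^nvidia' && echo "nvidia module loaded" || echo "nvidia module not loaded (expected on a non-GPU test box)"

# `nvidia-smi -L` prints one "GPU n: ..." line per detected card;
# count_gpus turns that into a number you can compare against expectations.
count_gpus() { grep -c '^GPU '; }

# nvidia-smi -L | count_gpus    # should print 3 on a triple-750ti box
```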









Checking GTX750ti performance:

Code:


[...]
[19:03:20][3169][INFO ] Using CUDA device #2 "GeForce GTX 750" (512 CUDA cores / 1177.60 GFLOPS)
[19:03:20][3169][INFO ] Version of installed CUDA driver: 7000
[...]

Code:


<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
[19:03:20][3169][INFO ] Application startup - thank you for supporting [email protected]!
[19:03:20][3169][INFO ] Starting data processing...
[19:03:20][3169][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 34 MB (2015 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[19:03:20][3169][INFO ] Using CUDA device #2 "GeForce GTX 750" (512 CUDA cores / 1177.60 GFLOPS)
[19:03:20][3169][INFO ] Version of installed CUDA driver: 7000
[19:03:20][3169][INFO ] Version of CUDA driver API used: 3020
[19:03:20][3169][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[19:03:20][3169][INFO ] Header contents:
------> Original WAPP file: ./PM0032_01351_DM68.00
------> Sample time in microseconds: 1000
------> Observation time in seconds: 2097.152
------> Time stamp (MJD): 50834.943028076457
------> Number of samples/record: 0
------> Center freq in MHz: 1231.5
------> Channel band in MHz: 3
------> Number of channels/record: 96
------> Nifs: 1
------> RA (J2000): 151700.8212
------> DEC (J2000): -571815.552
------> Galactic l: 0
------> Galactic b: 0
------> Name: G4835501
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 2097152
------> Trial dispersion measure: 68 cm^-3 pc
------> Scale factor: 1.30435
[19:03:20][3169][INFO ] Seed for random number generator is 1086604356.
[19:03:21][3169][INFO ] Derived global search parameters:
------> f_A probability = 0.04
------> single bin prob(P_noise > P_thr) = 1.2977e-08
------> thr1 = 18.1601
------> thr2 = 21.263
------> thr4 = 26.2923
------> thr8 = 34.674
------> thr16 = 48.9881
[19:03:21][3169][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 170 MB (1879 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 136 MB
[...]
[21:45:07][3169][INFO ] Statistics: count dirty SumSpec pages 16883 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 1886937
[21:45:07][3169][INFO ] Data processing finished successfully!
[21:45:07][3169][INFO ] Starting data processing...
[21:45:07][3169][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 34 MB (2015 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[21:45:07][3169][INFO ] Using CUDA device #2 "GeForce GTX 750" (512 CUDA cores / 1177.60 GFLOPS)
[21:45:07][3169][INFO ] Version of installed CUDA driver: 7000
[21:45:07][3169][INFO ] Version of CUDA driver API used: 3020
[21:45:07][3169][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:45:07][3169][INFO ] Header contents:
------> Original WAPP file: ./PM0032_01351_DM70.00
------> Sample time in microseconds: 1000
------> Observation time in seconds: 2097.152
------> Time stamp (MJD): 50834.943028034686
------> Number of samples/record: 0
------> Center freq in MHz: 1231.5
------> Channel band in MHz: 3
------> Number of channels/record: 96
------> Nifs: 1
------> RA (J2000): 151700.8212
------> DEC (J2000): -571815.552
------> Galactic l: 0
------> Galactic b: 0
------> Name: G4835501
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 2097152
------> Trial dispersion measure: 70 cm^-3 pc
------> Scale factor: 1.30435
[21:45:07][3169][INFO ] Seed for random number generator is 1086604356.
[21:45:08][3169][INFO ] Derived global search parameters:
------> f_A probability = 0.04
------> single bin prob(P_noise > P_thr) = 1.2977e-08
------> thr1 = 18.1601
------> thr2 = 21.263
------> thr4 = 26.2923
------> thr8 = 34.674
------> thr16 = 48.9881
[21:45:08][3169][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 170 MB (1879 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 136 MB
[...]
[00:23:48][3169][INFO ] Statistics: count dirty SumSpec pages 12350 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 1886937
[00:23:48][3169][INFO ] Data processing finished successfully!
00:23:48 (3169): called boinc_finish(0)
</stderr_txt>
]]>

Hi Tex,

Do you believe these numbers?? The low percentages on the graphics card fans?

Code:


+------------------------------------------------------+
| NVIDIA-SMI 346.46     Driver Version: 346.46         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 750 Ti  Off  | 0000:01:00.0     N/A |                  N/A |
| 46%   62C    P0    N/A /  N/A |    198MiB /  2047MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 750     Off  | 0000:02:00.0     N/A |                  N/A |
| 41%   49C    P0    N/A /  N/A |    162MiB /  2047MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 750     Off  | 0000:03:00.0     N/A |                  N/A |
| 44%   55C    P0    N/A /  N/A |    162MiB /  2047MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
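The same fan and temperature numbers can be pulled in a script-friendly form via nvidia-smi's query mode, which avoids parsing the table above. The `max_temp` helper is a hypothetical name of ours, not part of nvidia-smi:

```shell
# One CSV line per GPU: index, fan speed, temperature - e.g. "0, 46 %, 62"
#   nvidia-smi --query-gpu=index,fan.speed,temperature.gpu --format=csv,noheader

# max_temp: extract the hottest GPU temperature from that CSV output.
max_temp() {
    awk -F', ' '{ t = $3 + 0; if (t > m) m = t } END { print m }'
}
```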



----------



## DanHansenDK

*Mail from Performance PCS:*

_Hi

We only have one block for only reference 750 from EK and so please check their tool:
http://www.coolingconfigurator.com/

Yes we ship worldwide.
Thanks

Best Regards,
Customer Service_

*Our response:*

Dear Sirs,

_Thanks for your mail!

I'm sorry to say that the link/service you sent us to identify hardware doesn't work for us. The service itself works fine, just not for newish hardware like the ASRock OC Formula Z97 x4 SLI and the KFA2 GTX750ti Low Profile!?!?

We were not even able to find the mobo, which has been around for more than a year. We tried to select from a list, but there's a gap between the GTX680 and 760 - the list is apparently incomplete. Please help us solve the problem by identifying a product which can be used. Maybe you could ask the vendor? This is a project running into year 3, so I hope you'll try to find a piece of hardware that fits for us







If so, we'll certainly order from you!!
If you are able to find the waterblock, then please check whether it's possible to turn the block 90 degrees so that the hose connectors are at the end of the card instead of upright. There's no room for connectors or hoses in a 2U industrial chassis









Looking forward to hearing from you ASAP







_

It's not looking good. They ship overseas... yeahhh.... But there's not much help here









*Mail from Performance PCS:*

_We are sorry to say that the link is it. No one else makes anything for the hardware you have. Sorry._

Not much help there, eh??? Not even a yes/no to the question regarding the product Mr. Tex found. A "Yes" or a "No" would have helped.


----------



## DanHansenDK

Hi Magic,

This ToDo was tested 2 days ago and it's up to date!! Everything works like it's supposed to!

I've finished the WatchdogCpuTemp.sh shell script!! I had a lot of trouble getting it to work with the MSMTP client. MSMTP works along with sendmail after installation, but to be able to use "message files" from the script I'm calling the app directly!! It took about 10 hours to make it work! Not to send the mail or to make it work with Gmail - that part took 10 seconds - but to make it work with a message file containing multiple receivers, with subject and mail body. That was a G.. d... horror show.

This, my friends, means that we can now protect these systems. Not with an analog temperature detector which sounds an alarm or maybe mails a warning. No, we are able to set the limits for CPU heat ourselves: both a limit where we are warned and a limit for shutting the system down to protect it.
The script will of course run via CRON & crontab. The mail accounts go in the MSMTP config file "/etc/msmtprc". The messages for the system live in a text file. You can use the same text file, or use 2 like me, one for each kind of problem. I've chosen to log into the same logfile, but in the script I've commented out a little infotext showing how to log into 2 different logfiles. Doing it this way makes it easy to change the email warnings: you edit the text file, not the shell script. If your mail account changes, or you get a new one (to mail through), then just edit your config file "/etc/msmtprc". It'll all be part of the complete ToDo, but to help Mr. Magic launch this system design, I'll show how to do it.
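For anyone wiring this up before the full ToDo lands, the "message file" trick can be sketched like this. With `msmtp -t` the recipients are read from the message's `To:` header, so multiple receivers are just one comma-separated line. The addresses and paths below are placeholders, and `/etc/msmtprc` must already hold a working account:

```shell
# Write the warning message once; the watchdog script only has to pipe it
# to msmtp. Header lines first, then a blank line, then the body.
cat > /tmp/cpu_warn.msg <<'EOF'
To: admin@example.com, second.admin@example.com
Subject: [Beaufort] CPU temperature warning

One or more CPU cores exceeded the warning limit.
EOF

# Actual send (commented out here; needs a configured /etc/msmtprc):
# msmtp -t < /tmp/cpu_warn.msg
```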

Now there's the WatchdogGpuTemp.sh script to solve







It shouldn't be that big a problem now







It's the most important one in this case anyway, so I better get to work









_ongoing work....._



----------



## magic8192

Thanks for the work. I am going to work on the new boincer this weekend. Going to clone the hard drive in the 4P box and upgrade it to a SSD. Going to use the old drive from the 4P box in the new boincer.


----------



## DanHansenDK

Hi Magic,

_"...Going to clone the hard drive in the 4P box and upgrade it to a SSD.."_
What is a "4P box"?? Sorry, it's a term I don't know








Good idea with the SSD, I'm using SSDs for all my systems! They're really great along with Linux. Back when I used an old Pentium4 & Ubuntu 12.04 I chose to use an SSD and it worked very nicely. Actually my webserver & nameserver are still using 12.04 and a Pentium4 with an SSD. It's the only system which has not been updated yet









I managed to finish the CPU temperature script and made it so that you can use one or two different logfiles and one or two system message files. It runs and it works. Do you need it for your system? If so, I have a ToDo showing how to set up MSMTP, how to set up the config file and make mail work - Gmail included. I've got a ToDo showing how to set up the text files/system messages, with samples of content. The script itself has samples of setup and config as well. The GPU temperature script is in the making







This is what I'm fiddling with right now. It's a little tricky, but I think I can manage to get it to work before your weekend project starts. Do you know how to use and configure CRON? It's just that I would love not to have to make that part of the complete ToDo
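For anyone who doesn't, a minimal crontab sketch - the script path and the 5-minute interval are example values, not from Dan's ToDo:

```shell
# Edit root's crontab with `crontab -e` and add one line per watchdog:
#
#   */5 * * * * /usr/local/bin/WatchdogCpuTemp.sh
#
# The five time fields are: minute, hour, day-of-month, month, day-of-week.
# "*/5" in the minute field means "run every 5 minutes".
```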









What this first shell-script does:
"WatchdogCpuTemp" Checks core temperatures of each CPU core. Two temperature limits are set. the first as a warning limit. The second as a critical limit. If the first limit is exceeded the script log it and mails a warning. If the second limit is exceeded the script logs it, mails a critical alert and shuts the system down.

Sample of "WatchdogCpuTemp Warning Mail" due to CPU core temperatures higher than 50 degrees Celsius:



Sample of "WatchdogCpuTemp Warning Log Entry" due to CPU core temperatures higher than 50 degrees Celsius:



It's the same for the critical limit! Just another text in the warning mail and the logfile - and then of course the system has been shut down
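The two-limit logic described above can be sketched as a small shell function. The 50/75 limits and the function name are example values of ours; the real script reads per-core lm-sensors output:

```shell
WARN=50   # °C: log + mail a warning above this (example value)
CRIT=75   # °C: log + mail + shut down above this (example value)

# classify_temp: decide what the watchdog should do for one core's reading.
classify_temp() {
    t=$1
    if [ "$t" -ge "$CRIT" ]; then
        echo critical        # real script: log, mail, then `shutdown -h now`
    elif [ "$t" -ge "$WARN" ]; then
        echo warning         # real script: log and mail a warning
    else
        echo ok
    fi
}
```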









..


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic,
> 
> _"...Going to clone the hard drive in the 4P box and upgrade it to a SSD.."_
> 4P Box is?? Sorry, it's a word I don't know
> 
> 
> 
> 
> 
> 
> 
> 
> Good idea with SSD, I'm using SSD for all my systems! It's really great along with Linux. Back when I used old Pentium4 & Ubuntu 12.04 I chose to use SSD and it worked very nicely. Actually my webserver & nameserver are still using 12.04 and Pentium4 with a SSD disc. It's the only system which has not been updated yet
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Spoiler: The rest of the post
> 
> 
> 
> I managed to finish the CPU Temperature script and made it so that you can use one or two different logfiles and one or two system message files. It run and it works. Do you need it for your system? If, I have got a ToDo showing how to setup MSMTP, how to setup the config-file ad making mail work - gmail included. I've got a ToDo showing how to setup the textfiles/system messages and samples of content. The script itself has samples of setup and config as well. The GPU Temparature scripts is in the making
> 
> 
> 
> 
> 
> 
> 
> This is what I'm fiddling with right now. It's a little tricky, but I think I can manage to get it to work before your weekend project starts. Do you know how to use and config CRON? It's just that I would love not to have to make that part of the complete ToDo
> 
> 
> 
> 
> 
> 
> 
> 
> 
> What this first shell-script does:
> "WatchdogCpuTemp" Checks core temperatures of each CPU core. Two temperature limits are set. the first as a warning limit. The second as a critical limit. If the first limit is exceeded the script log it and mails a warning. If the second limit is exceeded the script logs it, mails a critical alert and shuts the system down.
> 
> Sample of "WatchdogCpuTemp Warning Mail" due to CPU core temperatures higher than 50 degrees Celsius:
> 
> 
> 
> Sample of "WatchdogCpuTemp Warning Log Entry" due to CPU core temperatures higher than 50 degrees Celsius:
> 
> 
> 
> It's the same for the critical limit! Just another text in the warning mail and the logfile, and the of course the system has been shutdown
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> .


Here is the 4P box. I am familiar with cron. Going to run a headless Linux system with some of your work for temp monitoring and getting all the cards working in Linux. I also plan to set up and use BoincTasks if possible.


----------



## DanHansenDK

Hi Magic,

_Here is the 4P box_

Wonderful









_Going to run a headless linux system with some of your work for temp monitoring and getting all the cards working in linux._

Yes, great... But isn't it newish Nvidia GT*/GTX* GPUs you are going to use? All of them?? If so, this ToDo will work!

Do you think you are going to use the script for protection of the CPU & GPUs?? I'm in the process of making the script for the GPUs... It's a little different because it's not "lm-sensors" we are using, it's "nvidia-smi". But the script for the CPU, with samples of mail warning text messages, is complete! The ToDo-part as well








The script works and I guarantee that you will be notified if the temperature exceeds limit1, and you will be notified & the system will shut down if limit2 is exceeded. I need to polish the script a little in the future. There's no need for it to send an email for each of the cores that are hotter than limit1. And there's a little issue with the last string telling which cores are OK. It's echoed to the screen so we don't see it, but I like things to be in order, as you might have noticed








But it's important to protect the server, so I'll continue with the GPU version of the script. I'll be running the script on my own systems from this moment on. The script has got a version text so that you can always see when a newer version is ready







I'll keep you up to date









BTW!! I've made a stupid beginner error!! But I found the reason for the increased temperature on GPU1 (d0). As you know, this one card ran hotter than the other two a few times. Here's the reason:

Code:


[08:21:13][3421][INFO ] Using CUDA device #0 "GeForce GTX 750 Ti" (640 CUDA cores / 1472.00 GFLOPS)
[09:42:24][3479][INFO ] Using CUDA device #1 "GeForce GTX 750" (512 CUDA cores / 1177.60 GFLOPS)
[19:03:20][3169][INFO ] Using CUDA device #2 "GeForce GTX 750" (512 CUDA cores / 1177.60 GFLOPS)

I bought the wrong card the first two times! Apparently there is a "ti" version which has 640 CUDA cores and I got it all wrong. It was the "ti" version I aimed for, but I made an error apparently. Now I can understand why the price was higher when I bought the last card. That's just too sad. I would have liked to have the "ti" version only.
But there's a positive side to it, right?? Now we are able to test both types of the card, and we already know that it runs a bit hotter than the other type. So I'll order the 4th card and make it a "ti" version. Having made this mistake, you are now spared from making the same one









OK, CRON has been set!! The script now runs 24/7/365! As you can see I've set the warning limit to 65 degrees C and the critical limit to 70 degrees C. I've set it to run every minute! That should be enough. Here's how it looks:
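For anyone who hasn't set up cron before, an entry matching that description (run every minute, warning at 65, shutdown at 70) would look roughly like this in `crontab -e`. The script path and logfile are assumptions for illustration, not Dan's actual locations:

```
# m  h  dom mon dow  command
*  *  *   *   *   /usr/local/bin/WatchdogCpuTemp.sh 65 70 >> /var/log/watchdogcputemp.log 2>&1
```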



.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic,
> 
> _Here is the 4P box_
> 
> Wonderful
> 
> 
> 
> 
> 
> 
> 
> 
> 
> _Going to run a headless linux system with some of your work for temp monitoring and getting all the cards working in linux._
> 
> Yes, great... But is it not Nvidia newish GT*/GTX* GPU's you are going to use? All of them?? If, this ToDo will work!
> 
> Do you think you are going to use the script for protection of CPU & GPU's ?? I'm in the process of making the script for the GPU's... It's a little different because its not "lm-sensors" we are using, it's "nvidia-smi" . But the script for CPU with samples of mail warning text messages is complete! The ToDo-part as well
> 
> 
> 
> 
> 
> 
> 
> 
> 
> .


Here is the card I am going to use. I only have one at the moment, but I plan on adding 2 more. I plan on using your scripts if I can. I installed the SSD and swapped the memory in the 4P rig last night. Now I have memory and a hard drive for the new boincer


----------



## DanHansenDK

Hi,

OK, this card, the "GIGABYTE GV-N75TOC-2GL G-SYNC Support GeForce GTX 750 Ti 2GB 128-Bit GDDR5 PCI Express 3.0 Video Card", is from a different manufacturer, but has the same GPU. And it's the model with 640 CUDA cores. So it's all good. I can see that it's a 2-slot model, but actually the heatsink doesn't look that wide. Maybe it's narrower than on the card from KFA2!?

I've been talking to my friend from a Danish supplier of cooling equipment. They are currently trying to solve the issue of finding a waterblock that fits the KFA2 GTX750 Ti OC SLIM Low Profile graphics card! Actually I wasn't aware of the existence of your card. I thought that KFA2 was the only one making a Low Profile edition for the time being. Do you know if there's a waterblock for the model you are using?

Hey!! I just saw that there are 2 1U PSUs in your MagicSuperCruncher!! Just a thought! If it's possible to get a single 1U PSU supplying the watts needed for "Test System 6 aka Beaufort" and your MagicSuperCruncher, then maybe we can solve this problem with the missing space!! Take the 1U PSU and place it on its side. That would give us an extra 4.4 cm of space. No, not in that chassis you are using, I can see. The PSUs are fitted on the other side compared to the chassis I'm using. And I'm not sure how wide the 1U PSUs are, or whether it's possible to install one on its side. I'm not sure the 1U PSUs come with that large a load either. Well, it was just a thought. We needed some space









"MagicSuperCruncher"


Magic, I've been studying your card!! It's longer, but narrower!! I'm sure that if you remove the black plastic cover, the heatsink itself is narrower than the one on my KFA2 card!! If you've got the time, please measure this!! This could solve your problem as well, and make room for 4 cards!!
I've been thinking of other solutions as well, like mounting another heatsink. That's only an idea right now, but I've learned that it's possible to change the small fans on the graphics cards as well. Fitting a better fan helps, but how much?? The only problem is that I can't get these parts in Denmark. That's life in the middle ages and far north for you







Well, I'm not giving up! I know we can solve this one way or another!! It's just that I liked the idea of making version 2 of my cruncher watercooled







It just sounds right, right?









.


----------



## magic8192

The card without the black shroud is 1.325 in/3.4925 cm. I only have 1.25 in/3.175 cm of space in the case. The other problem is that the card is a 2 slot card and there is only 1 slot available. The DVI connector on the card bumps into the back of the case because there is not a slot. The sad thing is that there is lots of wasted space on the other side of the case. I was wondering if I could fit this one in the case?


----------



## DanHansenDK

Hi Magic









_...I only have 1.25 in/3.175 cm of space in the case. The other problem is that the card is a 2 slot card and there is only 1 slot available. The DVI connector on the card bumps into the back of the case because there is not a slot. The sad thing is that there is lots of wasted space on the other side of the case. I was wondering if I could fit this one in the case?.._

OK, it's time for some math & scientific tests









First the data of my card. (and I love that KFA2 shows my card in profile WITHOUT THE HEATSINK!!):

SPECIFICATIONS:
GPU Engine Specs:
CUDA Cores 640
Base Clock (MHz) 1072
Boost Clock (MHz) 1150
Memory Specs:
Memory Speed 2700 (5400 effective)
Standard Memory Config 2048MB
Memory Interface Width 128-bit GDDR5
Memory Bandwidth (GB/sec) 86.4
Feature Support:
OpenGL 4.4
Bus Support PCI-E 3.0
Certified for Windows 8 Yes
Supported Technologies
SLI Options Not supported
Display Support:
Multi Monitor 3 displays
Maximum Digital Resolution 4096x2160
Maximum VGA Resolution 2560x1600
HDCP Yes
HDMI Yes
Standard Display Connectors One dual-link DVI-D, HDMI and VGA
Audio Input for HDMI N/A
Standard Graphics Card Dimensions:
Length 6.34 inches
Height 2.68 inches <--- YOU KNOW INCHES MAGIC!! OR ELSE CONVERT INTO CM.
Width 5.79 inches
Power Specs:
Maximum Graphics Card Power (W) 60W
Minimum System Power Requirement (W) 300W
Supplementary Power Connectors N/A
Model:
Product Code 75IGH8HX9KXZ
UPC Code 4895147114170

And here's the image. This isn't very important regarding your question, but I wanted to show it for comparison anyway.



OK, I found an image showing the thickness of the card. The co-manufacturer Galax makes these cards, so this must be the same size and shape of heatsink:



OK, let's first take a close look at the card you are talking about, the "MSI N750ti-2GD5TLP GeForce GTX 750 Ti 2GB 128-Bit GDDR5 PCI Express 3.0 x16".
Then let's take a look at the card you already have, and look at the heatsink in that image, shall we? I've been turning and twisting some images to get the best view. You can see where the heatsink is at two points, and then we just draw a line and see where it ends. Let's try that. (This is not perfect I know, but not quite useless either):



As you can see looking at the red line, the heatsink seems to reach a point where we can make a mark. Let's call that mark "X" on this card. I've enlarged the image below so that it shows more clearly:



OK, now let's take a look at it in profile and then compare it to the new card from MSI. I've made a point where the heatsink on the new card from MSI seems to end. I've called that point "Y". So now we compare the two points: "X" on your existing card and "Y" on the new card from MSI. According to this little experiment there's no doubt that the card from MSI has got a much "thinner" heatsink! I think it's because they have made it longer and are therefore able to make it thinner.



OK, now let's go back to my card for a while, because I've been thinking of alternatives to the watercooling. It seems that the manufacturers of video cards and watercooling parts don't plan alike







So while waiting for my friend to maybe solve the problem and find that waterblock, I'll try to think of other ways to do it. To be able to get that 4th card in, and to get some more space between the GPUs/graphics cards.
Yesterday I got an idea! What about using a hacksaw, I said to myself!! A hacksaw must be the problem solver on this matter. Then I looked around and found others who had done the same. Just look at these images















This is where I found the images and even though this is another type of system, the problem remains the same. Please don't make the heatsinks that FAT !!! Got it ???

_Hey, I'm working here....._

.


----------



## magic8192

It is odd, but the KFA2 card you have looks like a cross between the Gigabyte and MSI card. It has the slot setup of the MSI card with the heatsink of the gigabyte card. Very unusual. I ordered the MSI card to see if it fits in the last slot.


----------



## DanHansenDK

Hi Magic,

I'm guessing 2.30 cm!! 2/3 of your existing card!!

And as you can see my card is as wide as your existing. Then look at the image showing the profile of MSI:





.


----------



## magic8192

There is one other thing you can consider and that is a PCI-E riser cable. Something like this. It will allow you to move the video card to another spot. It should not be a big problem since you are going headless.


----------



## Tex1954

You guys, I was wondering why a right angle PCIe adapter and a full blown higher powered GPU couldn't be used.
Quote:


> Originally Posted by *magic8192*
> 
> There is one other thing you can consider and that is a PCI-E riser cable. Something like this. It will allow you to move the video card to another spot. It should not be a big problem since you are going headless.


I was thinking one could use a right-angle adapter and use two higher powered GPU's rather than four smaller powered cards.


----------



## DanHansenDK

Hi Guys,

Tex,

_I was thinking one could use a right-angle adapter and use two higher powered GPU's rather than four smaller powered cards._

That's actually not such a bad idea. I've been using those 90-degree connectors in 1U chassis!! Not such a bad idea at all!!!

Well, I'll have to take a break... Be back soon


----------



## magic8192

Quote:


> Originally Posted by *Tex1954*
> 
> You guys, I was wondering why a right angle PCIe adapter and a full blown higher powered GPU couldn't be used.
> I was thinking one could use a right-angle adapter and use two higher powered GPU's rather than four smaller powered cards.


A pair of GTX 970 cards would nearly match the PPD of 4 750ti cards, but would consume more power than 4 750ti cards. With the airflow that I have in my 2u case, I may be able to remove the fans on the 750ti to improve efficiency even more. From DanHansenDK pics and measurements, I think the MSI card will fit in my case.


----------



## Finrond

This thread is awesome, I can't wait to see some pics of the epic kludges!


----------



## magic8192

I got Ubuntu server 14.04.2 up and running and boinc installed with it being remotely managed by BoincTasks.
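For anyone reproducing this: with Ubuntu's boinc-client package, remote management by BoincTasks is typically enabled through two configuration files. The address and password below are placeholders for illustration, not magic8192's actual values:

```
# /etc/boinc-client/remote_hosts.cfg - hosts allowed to connect remotely
192.168.1.50

# /etc/boinc-client/gui_rpc_auth.cfg - password BoincTasks must supply
changeme
```

After editing, restart the client (e.g. `sudo service boinc-client restart`) so the new settings take effect.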


The only problem I had was getting grub to install on the boot sector of the hard drive. It would install on the flash drive by default and I would have to reinsert the flash drive to get the computer to boot.







Figured out how to move grub to the boot drive and it is working well.
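The usual way to do that on Ubuntu is grub-install plus update-grub. magic8192 didn't post his exact commands, so the sketch below only prints them as a dry run; `/dev/sda` is an assumption, so check the real internal disk with `lsblk` before running them with sudo:

```shell
# Dry-run sketch: move GRUB from the USB stick to the internal drive.
TARGET=/dev/sda                       # assumption - verify with lsblk
INSTALL_CMD="grub-install $TARGET"    # writes GRUB to the drive's MBR
UPDATE_CMD="update-grub"              # regenerates /boot/grub/grub.cfg
echo "sudo $INSTALL_CMD"
echo "sudo $UPDATE_CMD"
```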


----------



## Tex1954

Wow! Great job Magic!!!!


----------



## magic8192

I managed to get the CUDA drivers installed and the GPU crunching Asteroids. Managing the headless server remotely with BoincTasks is very simple vs those crazy command line entries that you would have to do otherwise. The only problem with DanHansenDK's install is that I had to put sudo in front of most of the commands to get them to execute. I do have some CPU temp issues that I will have to iron out.
CPU 1 runs much hotter than CPU 0.
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +58.0°C (high = +80.0°C, crit = +96.0°C)
Core 1: +59.0°C (high = +80.0°C, crit = +96.0°C)
Core 2: +56.0°C (high = +80.0°C, crit = +96.0°C)
Core 8: +54.0°C (high = +80.0°C, crit = +96.0°C)
Core 9: +57.0°C (high = +80.0°C, crit = +96.0°C)
Core 10: +60.0°C (high = +80.0°C, crit = +96.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 0: +75.0°C (high = +80.0°C, crit = +96.0°C)
Core 1: +74.0°C (high = +80.0°C, crit = +96.0°C)
Core 2: +72.0°C (high = +80.0°C, crit = +96.0°C)
Core 8: +75.0°C (high = +80.0°C, crit = +96.0°C)
Core 9: +75.0°C (high = +80.0°C, crit = +96.0°C)
Core 10: +76.0°C (high = +80.0°C, crit = +96.0°C)

There is a piece of black plastic that covers CPU0 and CPU0 ram. It is not an optional piece. Without it, not enough airflow gets to CPU1 and CPU1 ram. The interesting thing is that the machine will automatically throttle the CPU if temps get too high and you get a red light on the front of the case and a command line warning in Linux. I would think that I would have probably got some kind of a popup if I was running a GUI. The above temps are with the black plastic installed.


----------



## Tex1954

Magic, don't see any pictures of your current new setup with CPU's installed etc...

So far as temps go, your CPU's should run fairly cool given the size of the heat sinks and air flow and such. You also didn't mention which CPU is which.

Not having enough info, I would offer a couple tests you must surely already be aware of.

1) Measure actual CPU voltages if possible using a multimeter directly on the Mobo. Make sure they are the same for both CPU's.

2) Swap CPU's and see if high temps follow the CPU or stay with the socket.

3) Use a hair dryer with heat OFF ( or remove coils on cheapo one) to blow air here and there to see what happens.

4) Use a laser temp gun to poke around like I do. Kinda like *this one*

5) Obviously, seating of the HS on the CPU will make a diff... as will TIM etc...

Other than that, running the exact same tasks, I don't think you should have such a large temperature difference...










PS: If it's plastic, you can add, subtract, mod to your hearts content...


----------



## magic8192

I will get some pics this evening. I already tried to reseat and put new tim on. I did that at the same time I put the black plastic shroud back on. I think it was the shroud that really helped. I am going to swap the CPUs this evening. I would guess that it is an airflow issue. I may try your blow dryer trick before swapping the CPUs.


----------



## magic8192

Quote:


> Originally Posted by *Tex1954*
> 
> Magic, don't see any pictures of your current new setup with CPU's installed etc...


Here are the pics of the CPUs installed and the 2 video cards.




Here is the clearance for the new MSI card in the last slot without the fan or shroud. I still have to figure out how to cool it.


----------



## Tex1954

Nice! Typical low-pro server setup. Looks to me as if you have room to put a couple "thin" fans on top of those heatsinks and not worry about the plastic piece.










80x80x15

120x120x12


----------



## magic8192

Quote:


> Originally Posted by *Tex1954*
> 
> Nice! Typical low-pro server setup. Looks to me as if you have room to put a couple "thin" fans on top of those heatsinks and not worry about the plastic piece.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 80x80x15
> 
> 120x120x12


I really like the layout of this 2U server. I really hate how loud it is. I am going to have to do something about the sound. I might do the Dan Hansen fan mod and add fans to the CPU heatsinks, like you suggest.


----------



## DanHansenDK

Hi Magic!!

_I got Ubuntu server 14.04.2 up and running and boinc installed with it being remotely managed by BoincTasks._

Way to go my friend!! I've been in "wonderland" trying to solve the script issue and getting the nvidia-smi command to behave!! Got some help and the problem is solved. Just have to build it into the WatchdogCpuTemp script.. More about that.. Right now I just wanna say "right on, dude"!!!

_CPU 1 runs much hotter than CPU 0._

It may be the airflow, but what do we do!? Did you try any of Mr. Tex's suggestions???

_There is a piece of black plastic that covers CPU0 and CPU0 ram. It is not an optional piece. Without it, not enough airflow gets to CPU1 and CPU1 ram._

I'm guessing airflow as well! Did you check if all 5 fans in the front are running?? And, remove that plastic!! Even though it "directs" the airflow, it certainly doesn't help letting the heat out!! Away with it







And try again!!
Another thing. You need to check which fans are used. What is their RPM and how much are they really helping you? Like me, it's a good idea to look for a high-powered fan. You may find fans which move twice the amount of air for the same price! Check it out... I'll see what I can find too...



Mr. Tex wrote:
_Nice! Typical low-pro server setup. Looks to me as if you have room to put a couple "thin" fans on top of those heatsinks and not worry about the plastic piece._

I was thinking the same thing. Here's a bracket I made myself. I installed 5 fans on this and am using 2 of these in both my 1U webservers. I've opened one up for you to see what I mean. So, as Mr. Tex writes, there should be space enough













Here's what I mean (sorry for the poor image):



Which again means this (sorry for the even worse image):



Regarding your new GPU, it seems to fit!! Like with the CPUs, you just need airflow!! When you've tested it, I would very much like to hear the results. Maybe I should just choose 4 of those and keep it cooled by air.

This wasn't all that wrong by the way











MagicSuperCruncher


So Magic, help me out, I'm dying here! Please let me know what it measures. How wide is it???

Question







Did you use my work to make multiple GPU's run or did you find another way???

.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic!!
> 
> _CPU 1 runs much hotter than CPU 0._
> 
> It may be the airflow, but what do we do!? Did you try some of all Mr. Tex's suggestions???


Yes, it is an airflow issue.
Quote:


> I'm guessing airflow as well! Did you check if all 5 fans in the front are running?? And, remove that plastic!! Even though it "directs" the airflow, it certainly doesn't help letting the heat out!! Away with it
> 
> 
> 
> 
> 
> 
> 
> And try again!!
> Another thing. You need to check which fans are used. What are their RPM and how much are they really helping you. Like me, it's a good idea to look for a high powered fan. You may find fans which moves twice the amount of air for the same price! Check it out... I'll see what I can find too...


There are 5 spaces for fans, but only 4 fans. They move a good deal of air. They spin at something like 11,000 rpm at full speed and sound like a jet taking off.
Quote:


> Mr. Tex wrote:
> _Nice! Typical low-pro server setup. Looks to me as if you have room to put a couple "thin" fans on top of those heatsinks and not worry about the plastic piece._
> 
> I was thinking the same thing. Here's a bracket I made myself. I installed 5 fans on this and are using 2 of these in both my 1U webservers. I've opened one up for you to see what I mean. So, as mr. Tex writes there should be space enough


I am going to put fans on top of the CPU heat sinks and build a cooler much like you have. I have some aluminum sheet from a previous project and it should be perfect for this.
Quote:


> Regarding your new GPU it seems to fit!! Like with the CPU's you just need airflow!! When you've tested it, then I would very much like to hear the results. Maybe I should just choose 4 of those and then keep it cooled by air.


I moved the 2nd video card and reinstalled the cooler and I ran the machine for 24 hours. The GPU with the better airflow averaged 65 C and the other one about 75C. I am going to have to rig some cooling for the GPU when I put it in the last slot. It will run too hot without some extra cooling.
Quote:


> So Magic, help me out, I'm dying here! Please let me know what it measures. How wide is it???
> 
> Question
> 
> 
> 
> 
> 
> 
> 
> Did you use my work to make multiple GPU's run or did you find another way???
> .


The video card with the shroud and fan removed is 1 in/2.54 cm wide.

I used your work. It took a long time to download everything, but once it installed, it worked great.


----------



## DanHansenDK

Hi Magic,

_I moved the 2nd video card and reinstalled the cooler and I ran the machine for 24 hours. The GPU with the better airflow averaged 65 C and the other one about 75C. I am going to have to rig some cooling for the GPU when I put it in the last slot. It will run too hot without some extra cooling._

OK... It was the one without the fan which went up to 75 only!?!?

I've got an idea my friend!! What about directing the air directly at the GPU which is/will be placed in "socket 0000:03" (decimal number 4)? I'm going to find the thing I'm thinking of and then show you.. Give me a few minutes









Something like this. You can remove the bracket in the rear and then point the airflow where you want it. It's plastic and therefore you can mount it inside without any problems. Just a thought!?!?





_The video card with the shroud and fan removed is 1 in/2.54 cm wide._

Thanks a lot my friend... And I think it would fit nicely with 4 of those suckers









_I used your work. It took a long time to download everything, but once it installed, it worked great._

I'm proud like a newborn daddy








And shortly you will receive the script which works on both CPUs and GPUs









Hey! BTW.... You've got multiple CPU's with multiple cores... Would you show me your output from "LM-sensors" ??? So that I can check that "WatchdogCpuTemp" works on your rig ???

.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic,
> 
> OK... It was the one wothout the fan which went up to 75 only!?!?


No, it was the new video card with the fan in a different slot.
Quote:


> Thanks a lot my friend... And I think it would fit nicely with 4 of those suckers
> 
> 
> 
> 
> 
> 
> 
> 
> 
> I'm proud like a newborn daddy
> 
> 
> 
> 
> 
> 
> 
> 
> And shortly you will receive the script which works on both CPU's and GPU's
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Hey! BTW.... You've got multiple CPU's with multiple cores... Would you show me your output from "LM-sensors" ??? So that I can check that "WatchdogCpuTemp" works on your rig ???
> 
> .


I am not running it right now because of the temps. When I do run it my family complains about the sound. I will run it some tonight for you







. This has motivated me to finish my shop. Once I get the rack in my shop, I will be able to run it 24/7


----------



## DanHansenDK

_This has motivated me to finish my shop._

Great!!! I'm running a shop as well, built on a well-known base







If you run into problems, well, then just let me know. I may have some answers for you









_When I do run it my family complains about the sound. I will run it some tonight for you_

I'm guessing fan noise! Well, that's why I'm using a rack to mount it in. Nevertheless my test systems are also standing in our main room making noise







My family complains as well







You may be able to fit it in a small rack like mine. There's not much space in the back. It's only 60.0 cm so it's not standard. But it doesn't take up that much space....

BTW, all I need is the output from the "lm-sensors" command: # sensors

Something like this:

Code:


# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +58.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +56.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +54.0°C  (high = +80.0°C, crit = +100.0°C)

Maybe the kernel version as well:

Code:


# uname -a
Linux beaufort.mydomain.tld 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux


----------



## magic8192

Here you go then. Here are the temps from an earlier post








Quote:


> Originally Posted by *magic8192*
> 
> I managed to get the cuda drivers installed and the GPU crunching asteroids. managing the headless server remotely with boinctasks is very simple vs those crazy command line entries that you would have to do otherwise. The only problem with the DanHansenDK's install is that I had to put sudo in front of most of the commands to get them to execute. I do have some CPU temp issues that I will have to iron out.
> CPU 1 runs much hotter than CPU 0.
> coretemp-isa-0000
> Adapter: ISA adapter
> Core 0: +58.0°C (high = +80.0°C, crit = +96.0°C)
> Core 1: +59.0°C (high = +80.0°C, crit = +96.0°C)
> Core 2: +56.0°C (high = +80.0°C, crit = +96.0°C)
> Core 8: +54.0°C (high = +80.0°C, crit = +96.0°C)
> Core 9: +57.0°C (high = +80.0°C, crit = +96.0°C)
> Core 10: +60.0°C (high = +80.0°C, crit = +96.0°C)
> 
> coretemp-isa-0001
> Adapter: ISA adapter
> Core 0: +75.0°C (high = +80.0°C, crit = +96.0°C)
> Core 1: +74.0°C (high = +80.0°C, crit = +96.0°C)
> Core 2: +72.0°C (high = +80.0°C, crit = +96.0°C)
> Core 8: +75.0°C (high = +80.0°C, crit = +96.0°C)
> Core 9: +75.0°C (high = +80.0°C, crit = +96.0°C)
> Core 10: +76.0°C (high = +80.0°C, crit = +96.0°C)
> 
> There is a piece of black plastic that covers CPU0 and CPU0 ram. It is not an optional piece. Without it, not enough airflow gets to CPU1 and CPU1 ram. The interesting thing is that the machine will automatically throttle the CPU if temps get too high and you get a red light on the front of the case and a command line warning in Linux. I would think that I would have probably got some kind of a popup if I was running a GUI. The above temps are with the black plastic installed.


To clarify about the GPUs. I put the fan back on the MSI card and it was still running at 75 C with the fan on it. I don't think the side of the case where the PCI-E slots are has very good air flow.


----------



## DanHansenDK

Hi Magic,

Thanks for that








I'll take that into account when making the next version. "WatchdogCpuTemp version 0.1.8" that is.

Making the "WatchdogGpuTemp.sh" script work!
Do any of you guys know a little about shell scripts? We are currently having trouble finding a way to set 2 strings, "str" and "newstr", where "str" is the GPU number and "newstr" is the temperature.

Here's the command which works perfectly and shows the GPU output the same way as "sensors" from LM-sensors shows the CPU info/output:

Code:


# nvidia-smi -q -d temperature | grep GPU | perl -pe '/^GPU/ && s/\n//' | grep ^GPU

And the output:

Code:


# nvidia-smi -q -d temperature | grep GPU | perl -pe '/^GPU/ && s/\n//' | grep ^GPU
GPU 0000:01:00.0        GPU Current Temp            : 53 C
GPU 0000:02:00.0        GPU Current Temp            : 45 C
GPU 0000:03:00.0        GPU Current Temp            : 52 C
GPU 0000:04:00.0        GPU Current Temp            : 51 C

This is the nearest I've managed so far:

Code:


str=$(nvidia-smi -q -d temperature | grep GPU | perl -pe '/^GPU/ && s/\n//' | grep "^GPU 0000:0$i:00.0")
newstr=${str:54:2}

But it doesn't work just yet. If any of you know shell scripting, you could help a lot by solving this issue!! If so, let me know and I'll show you the newest version of the script/GPU-version
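One way to avoid the fixed-offset `${str:54:2}` substring entirely is to let awk split each line into fields. The sketch below was checked against the sample output quoted earlier in the post; the `sample` variable stands in for the real nvidia-smi pipeline:

```shell
# Sketch: pull the PCI address and temperature out of each line without
# counting character positions. "sample" mimics the pipeline's output.
sample='GPU 0000:01:00.0        GPU Current Temp            : 53 C
GPU 0000:02:00.0        GPU Current Temp            : 45 C'

# Field 2 is the PCI address; the next-to-last field is the temperature.
echo "$sample" | awk '{print $2, $(NF-1)}'
```

On newer drivers there is also `nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader`, which returns the index and temperature directly with no parsing needed.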


----------



## DanHansenDK

Hello









I've got great news! I solved the problem! I've managed to get the GPUs checked the same way as the CPUs. I've tested the script and it runs nicely. Here are a few outputs. Read it and weep









Code:

# ./watchdoggputemp.sh 50 60
JOB RUN AT Mon Jun 29 00:06:32 CEST 2015
=======================================

GPU Warning Limit set to => 50
GPU Shutdown Limit set to => 60

GPU 0: GeForce GTX 750 Ti      GPU Current Temp: 49 C
GPU 1: GeForce GTX 750         GPU Current Temp: 39 C
GPU 2: GeForce GTX 750         GPU Current Temp: 45 C
GPU 3: GeForce GTX 750 Ti      GPU Current Temp: 47 C

 Temperature GPU 0 OK at => 49

 Temperature GPU 1 OK at => 38

 Temperature GPU 2 OK at => 45

 Temperature GPU 3 OK at => 47

Status - All GPUs are within critical temperature limits
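The warning/shutdown decision the script makes for each GPU boils down to a two-threshold comparison. A minimal sketch (the `check_temp` name is mine; the real script also sends mail, logs, and powers the machine off):

```shell
#!/bin/sh
# Hypothetical sketch of the per-GPU check: compare one temperature
# against the warning and shutdown limits passed on the command line.
check_temp() {
    # $1 = current temp, $2 = warning limit, $3 = shutdown limit
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL"      # real script: log, mail, and shut down
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING"       # real script: log and send a notifying mail
    else
        echo "OK"
    fi
}

check_temp 49 50 60   # within limits
check_temp 55 50 60   # warning range
check_temp 61 50 60   # critical range
```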

And it logs as it is supposed to! Both "notifying mail + log" and "warning mail + log + critical shutdown" work, as highlighted in the red boxes

So now it's just to run

Code:

# crontab -e

and set it up like "WatchdogCpuTemp.sh". Like this:
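The crontab entry itself was shown as an image that didn't survive, but an entry along the lines described (the path, interval and log file here are assumptions, not Dan's actual values) would look something like:

```shell
# Hypothetical crontab line: run the GPU watchdog every 5 minutes with a
# 50 C warning limit and a 60 C shutdown limit, appending output to a log.
*/5 * * * * /root/watchdoggputemp.sh 50 60 >> /var/log/watchdoggputemp.log 2>&1
```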



And now we can sleep like babies, knowing our CPUs, and even more importantly our GPUs, will not get fried

This is it for this weekend. My head is spinning and I need rest!! I'll be back shortly and then we'll press on









.


----------



## magic8192

I don't have an lm-sensors command. I am using the sensors command. Here is the complete output of the command:

Code:

@X5670-2P:~$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +39.0°C  (high = +80.0°C, crit = +96.0°C)
Core 1:       +41.0°C  (high = +80.0°C, crit = +96.0°C)
Core 2:       +36.0°C  (high = +80.0°C, crit = +96.0°C)
Core 8:       +44.0°C  (high = +80.0°C, crit = +96.0°C)
Core 9:       +46.0°C  (high = +80.0°C, crit = +96.0°C)
Core 10:      +47.0°C  (high = +80.0°C, crit = +96.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 0:       +61.0°C  (high = +80.0°C, crit = +96.0°C)
Core 1:       +71.0°C  (high = +80.0°C, crit = +96.0°C)
Core 2:       +69.0°C  (high = +80.0°C, crit = +96.0°C)
Core 8:       +71.0°C  (high = +80.0°C, crit = +96.0°C)
Core 9:       +61.0°C  (high = +80.0°C, crit = +96.0°C)
Core 10:      +72.0°C  (high = +80.0°C, crit = +96.0°C)

w83627dhg-isa-0a10
Adapter: ISA adapter
Vcore:        +0.92 V  (min =  +0.00 V, max =  +1.74 V)
in1:          +0.88 V  (min =  +0.10 V, max =  +1.06 V)
AVCC:         +3.34 V  (min =  +2.98 V, max =  +3.63 V)
+3.3V:        +3.34 V  (min =  +2.98 V, max =  +3.63 V)
in4:          +1.15 V  (min =  +1.78 V, max =  +0.28 V)  ALARM
in5:          +0.89 V  (min =  +1.00 V, max =  +1.64 V)  ALARM
in6:          +0.88 V  (min =  +1.66 V, max =  +0.49 V)  ALARM
3VSB:         +3.34 V  (min =  +2.98 V, max =  +3.63 V)
Vbat:         +3.26 V  (min =  +2.70 V, max =  +3.63 V)
fan1:           0 RPM  (min = 2109 RPM, div = 128)  ALARM
fan2:           0 RPM  (min = 10546 RPM, div = 128)  ALARM
fan3:           0 RPM  (min =  811 RPM, div = 128)  ALARM
fan5:           0 RPM  (min = 1054 RPM, div = 128)  ALARM
temp1:        +34.0°C  (high =  +4.0°C, hyst = +94.0°C)  sensor = thermistor
temp2:        -29.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
temp3:        +32.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
cpu0_vid:    +0.000 V
intrusion0:  ALARM


----------



## DanHansenDK

Hi Magic,

Well, the command used by "LM-sensors" IS "sensors":

Code:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:         +57.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:         +55.0°C  (high = +80.0°C, crit = +100.0°C)
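For anyone building the CPU-side watchdog on this output: the per-core readings can be pulled out of `sensors` output with a short awk filter. A sketch, parsing a pasted sample so it runs without lm-sensors installed:

```shell
#!/bin/sh
# Hypothetical parse of "sensors" output: print each core's temperature
# as a bare integer, ready for the same limit checks as the GPU script.
sample='Core 0:         +60.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +57.0°C  (high = +80.0°C, crit = +100.0°C)'

# Split each "Core" line on '+' and '.', so field 2 is the whole degrees.
printf '%s\n' "$sample" | awk -F'[+.]' '/^Core/ {print $2}'
```

On a live system, `printf '%s\n' "$sample"` would be replaced by the `sensors` command itself.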

There may be another explanation for this, but I very much doubt the "sensors" command comes from anything other than "LM-sensors".

Code:

NAME
sensors - print sensors information
SYNOPSIS
sensors [ options ] [ chips ]
sensors -s [ chips ]

DESCRIPTION
sensors is used to show the current readings of all sensor chips.
sensors -s is used to set all limits as specified in the configuration file.

sensors knows about certain chips, and outputs nicely formatted readings for them; but it can also display the information of unknown chips, as long as libsensors knows about them.

OPTIONS
Tag     Description
-c config-file  Specify a configuration file. If no file is specified, '/etc/sensors.conf' is used. Use '-c /dev/null' to temporarily disable this default configuration file.
-h      Print a help text and exit.
-s      Evaluate all 'set' statements in the configuration file and exit. You must be 'root' to do this. If this parameter is not specified, no 'set' statement is evaluated.
-A      Do not show the adapter for each chip.
-U      Hide unknown chips.
-u      Treat all chips as unknown ones. Output will be of much lower quality; this option is only added for testing purposes.
-v      Print the program version and exit.
-f      Print the temperatures in degrees Fahrenheit instead of Celsius.
FILES
/etc/sensors.conf The system wide configuration file. See sensors.conf(5) for further details.
CONFORMING TO
lm_sensors-2.x
SEE ALSO

AUTHOR
Frodo Looijaard and the lm_sensors group http://www.lm-sensors.org/

https://help.ubuntu.com/community/SensorInstallHowto

Question!! Are you running CUDAx.x / with the built-in Nvidia driver? Said another way: did you use my ToDo to the letter? If so, I'm surprised to see temperature readings from the "sensors" command (lm-sensors). The reason I had to spend a whole weekend modifying the WatchdogXxxTemp.sh script was that installing CUDAx.x/the built-in Nvidia driver conflicts with "xxx.isa.xxxx" and the output of the sensors command.

.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic,
> 
> Well, the command used by "LM-sensors" IS "sensors":
> 
> Question!! Are you running CUDAx.x / with the build-in Nvidia-driver? Said in another way. Did you use my ToDo to the letter? If, I'm surprised to see temperature indication using the "sensors" command (lm-sensors). The reason why I had to use a whole weekend to modify the WatchdogXxxTemp.sh script was because the installation of CUDAx.x/build-in Nvidia-driver conflicts with "xxx.isa.xxxx" and the output of the sensors command.
> 
> .


I did everything in your script. The first time I ran sensors, it told me to run the sensors-detect program and it set everything up. I dunno?


----------



## Tex1954

Quote:


> Originally Posted by *magic8192*
> 
> I did everything in your script. The first time I ran sensors, it told me to run the sensors-detect program and it set everything up. I dunno?


LM-Sensors isn't perfect, but it's pretty good...

Still, I like the GUI myself... PSensor uses LM-Sensors and does a nice display you can edit...


----------



## DanHansenDK

Well helloo Mr. Tex









These are CLI servers, not GUI-controlled desktops.. I don't know if you can control/see output on a server with GUI-controlled software?? Do you know, Tex??? No, it's far from perfect, which is why I had to spend a whole weekend converting my WatchdogCpuTemp.sh script to nvidia-smi output #(/&%¤#(/&/()=%&!£ My G.., I still dream about it, man









Hi Magic









_"....I did everything in your script. The first time I ran sensors, it told me to run the sensors-detect program and it set everything up. I dunno?..."_

In my ToDo (script???) this has to be an old version... I have a new version, still not complete, but better anyway.. Which page did you take it from? Is it from the last time I posted it in here, a little more than a week ago? Or is it further back???


----------



## magic8192

This is what I used


----------



## DanHansenDK

Hi Magic,

That's just fine. I thought it was an old version because you mentioned sensors/sensors-detect. The full ToDo isn't quite done and lm-sensors (sensors) is a part of it. That made me believe I had shown an old version of it. No worries my friend, you are right on track


----------



## DanHansenDK

Hello friends,

Just wanted to hear if the crunchers are still running









BTW, I re-read some posts, and I will have to post a note here








Quote:


> I managed to get the cuda drivers installed and the GPU crunching asteroids. managing the headless server remotely with boinctasks is very simple vs those crazy command line entries that you would have to do otherwise. The only problem with the DanHansenDK's install is that I had to put sudo in front of most of the commands to get them to execute. I do have some CPU temp issues that I will have to iron out.


I will update the ToDo, with scripts and everything... I'm building some of the last scripts these days. It's shown in another thread, but the scripts will be used here as well.

Regarding the problems running my commands, I'll have to say "Sorry, man!". I forgot to say/show that these commands are best used after logging in as root. Take SU rights with "sudo su" and then it's up and away







I thought the "#" would show this. Not logged in as root shows "$" I think... Sorry for that








Please let me hear how everything runs









Kind Regards,
Dan


----------



## Tex1954

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hello friends,
> 
> Just wanted to hear if crunchers are running, still
> 
> 
> 
> 
> 
> 
> 
> 
> 
> BTW, I re-read some posts, and I will have to post a note here
> 
> 
> 
> 
> 
> 
> 
> 
> I will update the ToDo, with scripts and everything... I'm building the some of the last scripts these days. It's shown in another thread, but the scripts will be used here as well.
> 
> Regarding the problems running my commands, I'll have to say "Sorry, man!". I forgot to say/show that these commands are best being used after logging in as root. Take SU rights with "sudo su" and then it's up and away
> 
> 
> 
> 
> 
> 
> 
> I thought the "#" would show this. Not logged in as root shows "$" I think... Sorry for that
> 
> 
> 
> 
> 
> 
> 
> )
> 
> Please let me hear how everything runs
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Kind Regards,
> Dan


I'm still very interested in all you are doing... although I don't really see any performance degradation using a "SIMPLE" GUI... and TightVNC lets me see what is happening at a glance should the need arise.

Can't wait for more pics and news!


----------



## DanHansenDK

Hi Tex,

Sorry, I've been "out" for a while









I'm rebuilding the RACK so that these machines can run from inside the rack. Had the same problem as magic... Noise







So, I've been buying large fans, which I'm installing in the bottom of the RACK, and getting those MOLEX connectors from the States so that I can finish the system.. It's frustrating that they haven't been able to run for a while. I've been planning a version 3 of the cruncher. I just had a busy half year at the university and an operation on my arm. A bus ran me over some time ago, so I had some personal stuff to attend to









Well, I hope you are well and still going strong...
I'll be back, very soon









KR
Dan


----------



## magic8192

I love the 750ti. These cards stack up very well against any card in the PPD/Watt area and because of the low power, you can use inexpensive server equipment without any modification. I plan on using the 750ti in my server setups that I am working on. Thanks for all of this Dan!


----------



## DanHansenDK

Hi Magic,

Which version of the 750ti??? Brand, that is.... As I wrote earlier on, my choice wasn't that smart. First I chose the slow model and then a brand with a cargo-ship-sized heatsink.

Hope your project is doing great.

Kind Regards,
Dan


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Magic,
> 
> Which version of the 750ti ??? Brand, that is.... As I wrote ealier on, my choice wadn't that smart. First I chose the slow model and then a brand with a cargoship sized heatsink.
> 
> Hope your project is doing great.
> 
> Kind Regads,
> Dan


I have two different versions of the 750 ti
MSI GeForce GTX 750 Ti
GIGABYTE GeForce GTX 750Ti


----------



## DanHansenDK

Hello my friend,

OK, but which is the narrowest? Didn't we talk this over some time back? I think it was the MSI, which has a narrow heatsink, right?

KR
Dan


----------



## magic8192

The MSI card is the one we were discussing yes.


----------



## DanHansenDK

Hi,

Thanks... Just ordered the MSI N750TI-2GD5TLP graphics card - GF GTX 750 Ti - 2 GB ;-) Looking forward to installing a 4th card in test system 3. The project has been stalled because the brand I bought was too thick for a 4th card to be installed. Looking forward to testing it with 4 cards.

KR
Dan


----------



## Tex1954

I've been driving around for 3 months and home now. Still following your stuff Dan and Magic! Very interesting!

For me, I just try to avoid people trying to kill themselves on my front bumper...

Here is an example...

https://www.youtube.com/watch?v=x6RwEdLBxBg

In any case, life goes on and you can see why I use a good quality dash-cam...


----------



## Finrond

Quote:


> Originally Posted by *Tex1954*
> 
> I've been driving around for 3 months and home now. Still following your stuff Dan and Magic! Very interesting!
> 
> For me, I just try to avoid people trying to kill themselves on my front bumper...
> 
> Here is an example...
> 
> https://www.youtube.com/watch?v=x6RwEdLBxBg
> 
> In any case, life goes on and you can see why I use a good quality dash-cam...


Never understood how people can be so dumb. Looks like everyone was OK though.


----------



## magic8192

Quote:


> Originally Posted by *Tex1954*
> 
> I've been driving around for 3 months and home now. Still following your stuff Dan and Magic! Very interesting!
> 
> For me, I just try to avoid people trying to kill themselves on my front bumper...
> 
> Here is an example...
> 
> https://www.youtube.com/watch?v=x6RwEdLBxBg
> 
> In any case, life goes on and you can see why I use a good quality dash-cam...


I guess he thought you would move out of his way? Be safe out there!


----------



## Tex1954

I had just turned my head to check the trailer tandem clearance to time my turn... WIDE turns you know.... He chose that moment to try to turn in front of me instead of driving half a block to turn around safely...

He had an expired license, no insurance, worn thru the steel belt tires, and a previous ticket for driving with an expired license. Zero damage to my truck, but exploded his left rear tire on the corner of my bumper... Officer said he would be ticketed for at least the expired license and no insurance. He was a much older man and I think probably on a fixed income of not much. I felt very bad for him myself, just glad nobody was hurt. People do that to me all the time, especially in Laredo,TX... Up to this point, I managed to smash the brakes in time to avoid hitting them.... but this time because I had to watch my tandem position more closely, I missed seeing him in time to stop.

Sigh.... could have been worse I suppose, but I was barely moving forward... maybe 5-7 MPH or so... still feel bad about it. First time I ever hit anybody ever.... in all vehicles... I've been hit many times, dozens.... but always in the rear or while parked. I just felt and still feel bad about it.

You have to know, the police in all states and the federal government expect the "Motoring Public" to drive and make a lot of mistakes... BUT, they expect professional drivers to anticipate this poor driving and take steps to avoid accidents. A good driver is always aware of what is happening all the time and is "supposed" to take steps to minimize or avoid conflicts. That is why a car driver can win a lawsuit against a trucking company even if the truck driver was not technically at fault... If there is anything the truck driver could have done to avoid the accident, it is expected and demanded that the driver do so, or it is the professional driver's fault...

Super defensive and as safe as possible... so you see, I feel like I failed in my job as well... One second to check my tandems in a turn and BAM! My fault...

Sigh... life is like that...


----------



## magic8192

Were you able to replay the video for the cop?


----------



## Tex1954

Quote:


> Originally Posted by *magic8192*
> 
> Were you able to replay the video for the cop?


Of course, he watched it on my laptop. He sent me on my way while he started writing tickets against the other guy. Of course, our safety department got a copy as well as pictures of everything and all the police information... all is well with me except I feel really bad about it.

I drive _*40 tons*_ of *"can't stop fast"* and know it and try to do my best... and I failed this time... and the OTHER guy will have to pay the price because of my mirror timing... lesson learned... next time I stop until car passes I think...










Info Blip. Just my tractor with me and all my stuff and full tanks weighs 19,000 Lbs. There is more weight on ONE front tire than what my F150 extended cab P/U truck weighs fully loaded... add a heavy load and trailer and you get 6,000 Lbs on each front tire, 8,500 Lbs on each set of rear tractor and trailer tires. I could have run over and completely smashed that truck in not much different circumstances... frightening to think about... and it upset me...


----------



## DanHansenDK

My god Tex!!! What are those baboons doing????
I'm driving a cab myself to be able to stay at Uni, so I see a lot too. And that's a totally idiotic thing to do.
Damn....


----------



## DanHansenDK

Just Got the MSI card... Setting it up tomorrow. 4x gtx750ti
Magic, did you remove the fan on your MSI ????

KR
Dan


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Just Got the MSI card... Setting it up tomorrow. 4x gtx750ti
> Magic, did you remove the fan on your MSI ????
> 
> KR
> Dan


I did better. I got another 2U server and now I have 2 in each


----------



## DanHansenDK

OK, but you did something on the other system? Well, I will take a hike down "memory lane" ;-)

Well, all 3 systems, all 12 GPUs, will be put to work after being mounted in my rack. Looking so much forward to crunching again. Had the same problem as you did once. The noise

KR
Dan


----------



## spdaimon

Quote:


> Originally Posted by *Tex1954*
> 
> Of course, he watched it on my laptop. He sent me on my way while he started writing tickets against the other guy. Of course, our safety departed got a copy as well as pictures of everything and all the police information... all is well with me except my feel really bad about it.
> 
> I drive _*40 tons*_ of *"can't stop fast"* and know it and try to do my best... and I failed this time... and the OTHER guy will have to pay the price because of my mirror timing... lesson learned... next time I stop until car passes I think...
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Info Blip. Just my tractor with me and all my stuff and full tanks weighs 19,000 Lbs. There is more weight on ONE front tire than what my F150 extended cab P/U truck weighs fully loaded... add a heavy load and trailer you get 6,000 Lbs on each front tire, 8,500 Lbs on each set of tires on the rear tractor and trailer tires. I could have run over and completely smashed that truck in not much different circumstances... frightening to think about... and upset me...


I wouldn't be too upset, but I can understand your feelings. It's a "*** were you thinking" on the other driver's part. I have a lot of respect for (or you might call it fear of) trucks. I don't stay anywhere on the right side of trucks any longer than I have to to get by them. Not saying I am passing on the right, just if I was in a similar situation. People don't think. One time I got beeped at because I was hanging back to let a truck merge onto a road.. guess I was supposed to try to go past it. Me first! Idiots. Rather let it go in front of me than get in front of it.


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> Ok, but you did something on the other system? Well, will tale a hike down "memory Lane" ;-)
> 
> Well, all 3 systems, all 12 gpu's will be put to work after being mounted in my rack. Loockng so Munch forward to crunch again. Had same problem as you did once. The noise
> 
> KR
> Dan


If you remove the fan from the card, you will need to get a good bit of airflow over the card to keep it cool.
You need to do this before the pentathlon!


----------



## Finrond

Quote:


> Originally Posted by *DanHansenDK*
> 
> My god Tex!!! What are those baboons doing????
> I'm driving a cab myself to be able to stay at Uni, so I see a lot too. And thats a total idiotic thing to do.
> Damn....


Damn dude, driving a Mercedes as a taxi? That's badass.


----------



## mm67

Quote:


> Originally Posted by *Finrond*
> 
> Damn dude, driving a mercedes as a taxi? That's badass.


In Finland, 31.2% of taxis are Mercedes; I assume the situation is quite similar in Denmark


----------



## DanHansenDK

Hi guys,

I'm having an old problem again. When trying to start beaufort (project testsystem 1) the fans and PSU fan just start for a second and then stop, start for a second and stop again, etc. etc. So the system will not run. Any ideas?

Damn iPad spellcheck €$£#}\?!£*%#<€$

Hi Finrond,

Yes, and I really like driving that sucker!! Real Nice workplace.









MM67

Taxis in Denmark are getting "ordinary"!! It must be about 1/4 of them that are still Mercedes. Renault, VW, Nissan. Quite a few of them are VW. The VW Passat station wagon is quite nice, but the others, no way! Not at a cost of 3 € per km. You need to be able to give people a nice ride in comfortable conditions. And in a car not parked in just anyone's garage.


----------



## DanHansenDK

Hi guys,

I'm transferring a case I got through SETI. I think it's pretty relevant to this project. Please don't "forget" my problem above









From Seti:

Hi Dan

Did you figure out if your GT 640 is supported? I have ubuntu mate 15.10, latest nvidia driver (352.63), and a GT 640. BOINC does not 'see' the GPU. Note, I have another system with win7 and a GT 640 and boinc uses it. Any suggestions?

Many thanks

Hi P,

Did you solve it? I did manage to make it work. Pretty well, actually. You need to do a few things. If you didn't solve it yet, I've got a few tips for you. I made a ToDo, step-by-step, to make it work with multiple GPUs. I'm at work, but I brought my iPad, so I can help you if needed.

Kind Regards,
Dan

Hi Dan

I installed cuda-7.5 from the developer.nvidia.com site, boinc sees the GPU now but there are no cuda workunits for linux yet. Note: my win7pro with the same GT 640 card has plenty of WUs. Please share your todo, I appreciate it.

Paul

Hi Paul,

Did you use the .deb (Debian) file and did you dpkg it?... It took me almost half a year to solve, but this will work!! And it will work with multiple GPUs

http://www.overclock.net/t/1467918/project-headless-linux-cli-multiple-gpu-boinc-server-ubuntu-server-12-04-4-14-04-1-64bit-using-gpus-from-geforce-gt610-gt640-gtx750ti-to-crunch-data/240#post_24039095

Enjoy....

Kind Regards
Dan

Hi Dan

Thank you for the link. I think I did the right steps, either dpkg -i the .deb file or there is a .run file from NVIDIA which installs the whole /usr/local/cuda-7.5 tree. Since we don't have SETI cuda WUs for Linux yet, I will connect to another project to verify the GPU works. The event log of SETI shows:

Tue 29 Mar 2016 02:27:04 PM MST | | CUDA: NVIDIA GPU 0: GeForce GT 640 (driver version 352.63, CUDA version 7.5, compute capability 3.0, 1023MB, 1003MB available, 730 GFLOPS peak)

Tue 29 Mar 2016 02:27:06 PM MST | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz [Family 6 Model 60 Stepping 3]

Tue 29 Mar 2016 02:27:06 PM MST | | OS: Linux: 4.2.0-34-generic

I'm not ready yet for multiple GPUs ...

Hi,

But there's a third-party config which has done that... I'll look it up from my PC at home.

D

Ongoing work here.....


----------



## DanHansenDK

Hi,

Some reading:

http://setiathome.berkeley.edu/forum_thread.php?id=66123

Guys, do you have a suggestion for Paul?

Looking for that third party stuff still!

KR
Dan


----------



## magic8192

Quote:


> Originally Posted by *DanHansenDK*
> 
> When trying to start beaufort (project testsystem 1) the fans and psu fan just starts for a second and then stops, starts for a second and stops again etc. etc. So the system will not run. Any ideas?


My last computer that had that problem had a bad motherboard. Are there any beeps or other codes that give you a hint at the problem? If you don't have a case speaker hooked up, hook it up to see if you get a beep code. Some boards will beep a sequence of long and short beeps that you can look in the manual to determine the error.

What is the last thing you did before the problem started? It could be a power issue where the system isn't getting enough power. Make sure all the motherboard power connectors are hooked up and borrow a beefier power supply to see if that helps.

It could also be a failed CPU or memory, try reseating all the components and if none of that works, remove everything except the CPU and one memory stick. If that doesn't work try a different CPU and memory. If that doesn't work, it is probably the board.


----------



## Tex1954

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi huy's,
> 
> I'm having an old problem again. When trying to start beaufort (project testsystem 1) the fans and psu fan just starts for a second and then stops, starts for a second and stops again etc. etc. So the system will not run. Any ideas?


That is typical of systems that fail some part of the boot-up process. Overclocking, bad memory, an SSD/HD that doesn't respond completely... IDE trying to boot AHCI and vice versa... something isn't happy.

Often when a system has been sitting unused a long time, the CMOS can get corrupt. Obvious first thing is reset BIOS to defaults and try again.

Second thing, something aged... Magic tells how to see what it could be.

Bad SATA cables have caused this too...

Updating BIOS can sometimes fix things...

Good luck!


----------



## bfromcolo

Quote:


> Originally Posted by *magic8192*
> 
> My last computer that had that problem had a bad motherboard.


Same here, but that was a desktop. It also occasionally would give memory error beep codes.


----------



## DanHansenDK

Hi Tex, Magic & BFrom...

I've got a small speaker attached always.

The problem occurred when the 2 test systems were in the testbench too. I'm not sure if it's a problem with the PSUs or just too little juice available. But it sounds right.
Because more than one system has the problem you could normally rule the mobo issue out, but as you know, I'm using the exact same mobos on all 3 test systems. Same SSD, same memory, etc. etc. So it's hard to say. It's just that there's only 1x 150 watts, 2x 300 watts and then this PSU at 300 watts maximum. Even if they all ran at 100%, there should be enough juice!?!?

It might just be the Mobo.

KR
Dan


----------



## DanHansenDK

Hi,

Still testing....

Testsystem 3 version 2 should be ready Tuesday/Wednesday and will be running 4x NVIDIA GTX750ti... or 2x ti and 2x non-ti. Just to see which is the stable one. No, I made an error back then ordering them, but it sounded much better









BTW, if any of you need those shell scripts, they are done (Linux guys):

Question!! Anyone know how the fæl. I cancel/deselect spellchecking on an iPad?

Jeres a Line without forresten ---> was my attempt to write, here's a line without spellcheck









KR
Dan


----------



## DanHansenDK

It's a sad time BTW.. My ferret Buster, who is my avatar, died the other day.. This is the reason I didn't finish the correspondence I started with Pavlos (Paul)







Sorry to bring it up, just thought a reason for the unfinished correspondence was needed









OK..


----------



## Tex1954

Dang!

Sorry to hear about Buster... into animal Heaven he went like all critters...

Condolences...









Don't discount PSU's... even a 300W one can screw up a boot if it droops badly at startup... I have two 500W PSU's in my house that do that, both of them Thermaltake POS units I got cheap...

Bad battery on Motherboard can do that too...

Good Luck!


----------



## Finrond

My process for diagnosing these types of issues is to just keep disconnecting things until it boots, then swap memory with known good, swap psu with known good, etc... until you are able to narrow down the possibilities.


----------



## magic8192

Sorry to hear about Buster. My condolences.


----------



## DanHansenDK

Quote:


> Originally Posted by *Tex1954*
> 
> Dang!
> 
> Sorry to hear about Buster... into animal Heaven he went like all critters...
> 
> Condolences...
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Don't discount PSU's... even a 300W one can screw up a boot if it droops badly at startup... I have two 500W PSU's in my house that do that, both of them Thermaltake POS units I got cheap...
> 
> Bad battery on Motherboard can do that too...
> 
> Good Luck!


Hi Tex,

Thanks my friend...

Damn, isn't it possible to disable spellchecking for iPads. My G.. ,!!

Regarding the PSU, it's an industrial design which usually works, but not right now. I think it's a bad mobo, as you guys suggested earlier on. It's just that I usually ran these systems, as you know, with no problems. This problem was detected on test system 1 with another type of ASRock board. I'm still trying to test and find the problem...

KR
Dan


----------



## magic8192

@DanHansenDK
Since the motherboard is bad, just solve the problem and get a Supermicro X9DR3-LN4F+ or ASUS Z9PE-D8 WS and a pair of those cheap E5-2670 CPUs from eBay. Your current non-ECC memory may even work.


----------



## DanHansenDK

Hi Finrond,

_
My process for diagnosing these types of issues is to just keep disconnecting things until it boots, then swap memory with known good, swap psu with known good, etc... until you are able to narrow down the possibilities._

And that's just the way to do it. Memory, CPU, OS/SSD didn't do the trick. Trying the PSU now, but it's weird


----------



## DanHansenDK

Quote:


> Originally Posted by *magic8192*
> 
> @DanHansenDK
> Since the motherboard is bad, just solve the problem and get a supermicro X9DR3-LN4F+ or ASUS Z9PE-D8 WS and a pair of those cheap E5-2670 CPU's from Ebay. Your current non ECC memory may even work.


My G.. it's Christmas ;-)

Looks like a plan for Headless Multiple ...... Cruncher vers. 4. Have to look into this. Those Asus boards are very expensive, but I started version 2 with a very expensive Asus mobo as well.

Well, a reason to solve this boring issue ASAP ;-)


----------



## DanHansenDK

Quote:


> Originally Posted by *magic8192*
> 
> Sorry to hear about Buster. My condolences.


Magic, thanks my friend. Sorry to go completely "Facebook" and all, but you know how it is... Thanks..


----------



## Tex1954

Quote:


> Originally Posted by *DanHansenDK*
> 
> Hi Tex,
> 
> Thanks my friend...
> 
> Damn, isn't it possible to disable spellchecking for iPads. My G.. ,!!
> 
> Regarding the PSU, it's a industrial design which usually works, but not right now. I think its a bad design in Mobo Technology as you guys suggested ealier on. Its just that I usually ran these systems, as you know, with no problems. This problem was detected on testsystem 1 with another type of ASRock board. I'm still trying to test and find the problem...
> 
> KR
> Dan


Umm, sometimes a bad mobo battery causes problems... Hope the board is still under warranty...


----------



## DanHansenDK

Hi Tex,

Long time no talk!?!?









Hi, thanks... I'm back doing projects until august where I'll have to be back on Uni....

I didn't solve the problem regarding the surge or what we'll have to call this problem. Anyway, I'm going to Roskilde now, to buy several PSU's, a new 640 card and some grounded power cables as well. This way I'll be able to rule out things thats not causing the problem.

I've built a rack box with lots of cooling and a pipe venting the air out of the house. This way I will be able to run a lot more servers. Two version 1 systems will be running, version 2 when finished, and the new version 3 (still under design).







More about this later









I hope you are all well and that we'll speak soon again









Kind Regards,

Dan


----------



## DanHansenDK

Status TestSystem "Halifax" version 2

Dismounted all 4 GPUs, changed the CMOS battery, but still no change. So it's not low wattage due to the 300 W PSU on this test system. And it's not because of the CMOS battery either. I tested power sources too, and that wasn't it. Wellington, test system 1, had a few hiccups but did run after a few restarts. That system is running at full capacity now. I just don't get it.

KR
Dan


----------



## DanHansenDK

OK,

Tested everything except a new, larger PSU, i.e. a 2U industrial PSU like the one "Test System 3 - Beaufort" uses. I'll try to test it with this. I've tested everything. Unplugged everything, and even though I knew it wasn't because of the memory, I tested with new modules. The problem occurs before the boot-up failure sequence, so it has to be a component error (defective mobo) or the PSU. I've thought about this power issue a lot and this was my possible solution to the problem.

STATUS:

1. Fix Test System 2 (mobo/PSU issue) and make it run with the version 2.0 "software setup" (with all Watchdog* shell scripts running)
2. Finish the build of "Test System 3 - Beaufort" with 2x GTX750, 2x GTX750Ti (the 4th one a card like magic's because of space issues). This way we'll get the stability between different manufacturers/models tested as well.
3. Discuss and design Test System 4 with new software ideas as well (scripts doing *****)









That being said... I can always install 3 new mobos for Test Systems 1, 2 and 3. The new project, Test System 4, will be a whole new ballgame. Still cost-efficient, same idea, but one step up the ladder









Suggestions for a 4+ SLI mobo, will be greatly appreciated
or
should we go for a system with one large GPU?

Idea:
ASUS P10S WS motherboard - Intel C236 - Intel LGA1151 socket - DDR4 RAM - ATX
Manufacturer: ASUS
Model: 90SB05T0-M0EAY0
EAN: 4712900268348



_Ongoing work....._


----------



## DanHansenDK

Hello again









Because of the new version of Ubuntu Server, we will be fitting "Test System 3 alias Beaufort" to run 16.04.x and the newest version of CUDA. This way, Test System 4 issues will be hardware-related and not due to changes in the software design. Test Systems 1 & 2 will keep 14.04.x.

*CUDA 8.0 - Linux/.deb* :

Operating System: Linux
Architecture: X86_64
Distribution: Ubuntu
Version: 14.04
Installer Type: .deb

Download Installer for Linux Ubuntu 14.04 x86_64
Base Installer
Installation Instructions:

Code:

`sudo dpkg -i cuda-repo-ubuntu1404_8.0.44-1_amd64.deb`
`sudo apt-get update`
`sudo apt-get install cuda`

Use the "wgett" command from my ToDo and replace the URL with:

Code:

Command: # wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_8.0.44-1_amd64.deb

Operating System: Linux
Architecture: X86_64
Distribution: Ubuntu
Version: 16.04
Installer Type: .deb

Download Installer for Linux Ubuntu 16.04 x86_64
Base Installer
Installation Instructions:

Code:

`sudo dpkg -i cuda-repo-ubuntu1604_8.0.44-1_amd64.deb`
`sudo apt-get update`
`sudo apt-get install cuda`

Use the "wgett" command from my ToDo and replace the URL with:

Code:

Command: # wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.44-1_amd64.deb
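The two installers above differ only in the Ubuntu release tag embedded in the URL. As a small sketch (the helper name `cuda_repo_url` is my own, not part of any tool), the URL can be built from the release tag:

```shell
#!/bin/sh
# Sketch: build the CUDA 8.0.44 repo-package URL for a given Ubuntu
# release tag (1404 or 1604), matching the two URLs listed above.
# cuda_repo_url is a hypothetical helper name.
cuda_repo_url() {
  echo "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu${1}/x86_64/cuda-repo-ubuntu${1}_8.0.44-1_amd64.deb"
}

# prints the ubuntu1604 URL shown above
cuda_repo_url 1604
```

From there the steps are the same as in the instructions above: `wget` the URL, `sudo dpkg -i` the downloaded .deb, then `sudo apt-get update` and `sudo apt-get install cuda`.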

.


----------



## DanHansenDK

*Status 11.12.16 05:24pm:*

# Watchdog* ShellScripts:

Regarding the scripts WatchdogCpuTemp.sh v0.1.6 & WatchdogGpuTemp.sh v0.1.7 - these work as they are supposed to now!! Finding the right command for the GPU script took a great deal of work... But it runs now. We could have chosen one of the first solutions, but those would have had to be modified along with updates of the different Linux distros.

OK, "Test System 2 aka Halifax" will be finished, when I've bought new 510watts 2U PSU's with 24connector/6+2. If its because of the 300watts PSU and to low powersuply or if its due to the missing connectors, I don't know, but this is the last possibility if the mobo works for this at all. It ran for some time, so the setup works. But it did have some hickups as you might remember









Back to "Test System 3 aka Beaufort"

*Status "Test System 3" aka "Beaufort" - 11.12.16 05:38pm:*

It's time to install the 4th GPU, an MSI GeForce GTX 750 Ti. Like Magic, we had a problem with the first versions of the low-profile GTX750Ti cards. The model from KFA2 has a large heatsink with a fan sunk into it. This model, the one from MSI, has fans too, 2 actually, but these are mounted on top of the heatsink. Removing these will make it fit into the remaining PCI-e socket. (I hope)







_....ongoing work_

_I'll enter the images showing the work later on. Have to take a break here._

In this test system I use the larger 2U 510 W industrial PSU, and there have been no issues running so far. The reason I mention this is because of the problems with "Test System 2 aka Halifax".

Code:

Dimensions      70 x 100 x 240 mm
SATA            3
Min. load       56 W
Molex           2
PCI-Express     6+2 pins
20/24 pin       24
Power           510 W
Max. power      510 W
Supply voltage  103-264 VAC

2U Industrial 510 W ATX P4, PCI-Express & SATA:


----------



## DanHansenDK

*Status "Test System 3" aka "Beaufort" - 11.12.16 05:38pm:*

It's time to install the 4th GPU, an MSI GeForce GTX 750 Ti. Like Magic, we had a problem with the first versions of the low-profile GTX750Ti cards. The model from KFA2 has a large heatsink with a fan sunk into it. This model, the one from MSI, has fans too, 2 actually, but these are mounted on top of the heatsink. Removing these will make it fit into the remaining PCI-e socket. (I hope)

As seen here, only 3 GPUs are running. These are the cards from KFA2, GTX750/GTX750Ti. Let's see if the 4th card fits and if it will run using the Linux driver

GPU Temp - # nvidia-smi* :



As seen below, all 4 GPUs are running on "Test System 1 aka Wellington":







*Status "Test System 3" aka "Beaufort" - 11.12.16 11:39pm:*

OK, the MSI GTX750Ti has now been installed. As seen below, the temperature of card 4 (d3) went up right away. You can compare it with cards 2 & 3 (d1 & d2), which are numbers 2 and 3 from the top in the temperature images ("Temperature/GPU Current Temp etc."). The reason we are comparing with those 2 cards and not card 1 (d0) is that that GPU is currently running another type of workunit (SETI). That type doesn't run the GPU as hard, so it doesn't get as hot! I showed this in my post above.

Temperature/GPU - Right after installation:



Temperature/GPU - 5 min.:



Temperature/GPU - 10 min.:



The temperature of GPU 4 (d3), the new MSI GTX750Ti, is stable for now, but a bit higher than the others. This must be because this card hasn't got any fans. At the same time, the 4th card is placed in a not-so-smart spot. The 4th fan, as shown in the images in the posts above, points directly at the PSU. These 2 factors may add to the increased temperature. All fans are, as you might remember, replaced with Papst industrial fans. But when the airflow isn't pointing the right way, it may be less efficient. When the top of the case is on and the chassis is completely closed, the temperature drops a degree, not more. So there may be more cooling to find by constructing some kind of air-direction duct... Suggestions will be greatly appreciated.

Here you can see GPU 4 (d3) running, and that the workunit is the same type as on GPUs 2 and 3 (d1 & d2):



_A 24 hour test has been initiated!!!_

PLEASE NOTICE!! IMPORTANT!!

When running these new settings and GPUs etc., it's vital to remember to modify the Watchdog* scripts!!! In this case we are setting the WatchdogGpuTemp.sh script to warn at 65 degrees Celsius and shut down the system at 70 degrees Celsius!!

Here's my current settings:

Add the GPU to the script! Red arrow! Or just do what the info in the script tells you to do











Modify the command to the degrees of your own choosing: red arrow, or follow the info in the script header. Set it low to begin with. Better to get notified a few times and learn from it than the alternative:
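For reference, the warn/shutdown decision described above boils down to a simple threshold check. This is a minimal sketch of that logic only, not the actual WatchdogGpuTemp.sh; the function name and thresholds are illustrative:

```shell
#!/bin/sh
# Sketch of the warn/shutdown threshold logic (not the real script).
WARN=65       # degrees Celsius: send a warning
SHUTDOWN=70   # degrees Celsius: shut the system down

check_temp() {
  # $1 = current GPU temperature in whole degrees Celsius
  if [ "$1" -ge "$SHUTDOWN" ]; then
    echo "SHUTDOWN"
  elif [ "$1" -ge "$WARN" ]; then
    echo "WARN"
  else
    echo "OK"
  fi
}

# In a real watchdog the temperature would come from the driver, e.g.:
#   nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader
# prints: WARN
check_temp 66
```

The real script would loop over all installed GPUs and feed each reading through a check like this, emailing on WARN and calling shutdown on SHUTDOWN.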



_Ongoing work...._



.
.


----------



## DanHansenDK

*Status "Test System 3" aka "Beaufort" - 14.12.16 05:13am:*

Test still running, looking pretty good so far. No warnings issued by the scripts, and the log confirms they are working, so that's all good. The BOINC results are looking good. I'll list the results when all 3 finished test systems are running and the results are reliable.









Temperatures are peaking at 67 degrees Celsius. Let's see what happens after one of the large top-mounted rack fans is replaced, and after that, what the result will be once the rack-mounted 4-fan 1U section is installed









Code:

Temperature
        GPU Current Temp            : 58 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 54 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 50 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
    Temperature
        GPU Current Temp            : 66 C
        GPU Shutdown Temp           : N/A
        GPU Slowdown Temp           : N/A
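To pull just the temperatures out of `nvidia-smi -q` output like the block above, a one-line awk filter is enough. A sketch, using a saved sample of the output shown above (the temp file path is arbitrary):

```shell
#!/bin/sh
# Extract the per-GPU temperatures from saved `nvidia-smi -q` output.
# The sample below is taken from the output shown above.
cat > /tmp/smi_sample.txt <<'EOF'
    Temperature
        GPU Current Temp            : 58 C
        GPU Current Temp            : 54 C
        GPU Current Temp            : 50 C
        GPU Current Temp            : 66 C
EOF

# Print the value after the colon for each "GPU Current Temp" line
# prints: 58 C / 54 C / 50 C / 66 C (one per line)
awk -F': ' '/GPU Current Temp/ {print $2}' /tmp/smi_sample.txt
```

On a live system you can pipe directly: `nvidia-smi -q | awk -F': ' '/GPU Current Temp/ {print $2}'`.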

* A large fan in the top of the rack box broke down. It began with noise and then it just shut down. I've retrieved a new one; I'll show images later. At the same time I bought a large 1U rack-mounted fan unit, with 4 large fans running at 220 volts like the ones in the top of the rack. This is to get some airflow from the bottom of the rack to the top, where it's vented out through a 125 mm pipe. Anybody who needs to know how to do this without robbing a bank first, just let me know and I'll illustrate it for you.

*Status "Test System 2" aka "Halifax" - 14.12.16 05:37am:*

OK, I've picked up a new industrial 2U 510 W PSU for Halifax. It'll be installed and then we'll see. Damn, I can't wait to see if it runs.









.


----------



## DanHansenDK

*Status "Test System 2" aka "Halifax" - 15.12.16 02:44am:*

OK, let's begin... Look at this, it may have to do with the fact that the 300 W PSU doesn't have the 24-pin and/or 8-pin sockets to fit the mobo...







_Ongoing work......._

...


----------



## DanHansenDK

Anyone!?!?

How do I post a post here that stays e.g. at the top of the whole thread??? On page 1, for example?? I would like to post the finished scripts, todos etc. now. I think things are stable enough now that you and others can make use of it









Dan


----------



## navjack27

could i do something like this with my old rig that has no case or hard drive at the moment, with just a usb stick?



5775c and a 390x with 32gb ram. i'll eventually get windows on it once i get a ssd for it, but in the mean time for a lil Christmas project. i think a headless Linux box would be neat.


----------



## Tex1954

It's been my experience that a USB stick is painfully slow and will affect system I/O speeds and cost you a lot of points on some projects....

I use SSD's on everything. So far, the Corsair SATA-1 Refurbs seem to work fine 9 out of 10 times, the Intel SATA-1 Refurbs seem to work with some systems and not others, the Kingston V200/V300 seem fine in all setups. The later version Corsairs also seem okay in many setups.

However, with prices dropping so fast, you can get a NEW 120G or 60G for dirt cheap. I haven't tried those KingDian (or whatever) drives yet, but am tempted... 60G SSD for $28? Pretty good!


----------



## tictoc

Quote:


> Originally Posted by *Tex1954*
> 
> It's been my experience that a USB stick is painfully slow and will affect system I/O speeds and cost you a lot of points on some projects....
> 
> I use SSD's on everything. So far, the Corsair SATA-1 Refurbs seem to work fine 9 out of 10 times, the Intel SATA-1 Refurbs seem to work with some systems and not others, the Kingston V200/V300 seem fine in all setups. The later version Corsairs also seem okay in many setups.
> 
> However, with prices dropping so fast, you can get a NEW 120G or 60G for dirt cheap. I haven't tried those KingDian (or whatever) drives yet, but am tempted... 60G SSD for $28? Pretty good!


I've had good luck so far with the 120G SanDisk SSD's. At $40/drive they are pretty tough to beat.


----------



## tictoc

Quote:


> Originally Posted by *DanHansenDK*
> 
> Anyone!?!?
> 
> How do I post a post here that stays e.g. at the top of the whole thread??? On page 1, for example?? I would like to post the finished scripts, todos etc. now. I think things are stable enough now that you and others can make use of it
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Dan


You can edit your first post, and put in all the finished scripts, todo's, etc. Since the OP is more than one year old you will probably need to message a moderator if you don't see the edit button at the bottom of the OP. After one year of no activity in a post, the post gets locked.


----------



## DanHansenDK

Quote:


> Originally Posted by *navjack27*
> 
> could i do something like this with my old rig that has no case or hard drive at the moment, with just a usb stick?
> 
> 
> 
> 5775c and a 390x with 32gb ram. i'll eventually get windows on it once i get a ssd for it, but in the mean time for a lil Christmas project. i think a headless Linux box would be neat.


Sorry about the late response... Moved to a farm, or whatever it's called.... Getting speedy internet is a task here







Damn.....

Hi,

Yes, of course... We usually just drill holes in a square piece of 5 mm thick plastic/PVC... Use a backplate from a case to get your holes right. Then use copper standoffs for spacing









KR
Dan


----------



## DanHansenDK

Quote:


> Originally Posted by *Tex1954*
> 
> It's been my experience that a USB stick is painfully slow and will affect system I/O speeds and cost you a lot of points on some projects....
> 
> I use SSD's on everything. So far, the Corsair SATA-1 Refurbs seem to work fine 9 out of 10 times, the Intel SATA-1 Refurbs seem to work with some systems and not others, the Kingston V200/V300 seem fine in all setups. The later version Corsairs also seem okay in many setups.
> 
> However, with prices dropping so fast, you can get a NEW 120G or 60G for dirt cheap. I haven't tried those KingDian (or whatever) drives yet, but am tempted... 60G SSD for $28? Pretty good!


Hi Tex,

SSDs are fast and cheap on energy, but!! I've experienced hardware resets... lost MBRs or whatever it's called. This may just be because of a late-night fast reboot. Of course I did this myself, but I never had the problem with the optical drives.

What do you think about that???









KR
Dan


----------



## DanHansenDK

Quote:


> Originally Posted by *tictoc*
> 
> You can edit your first post, and put in all the finished scripts, todo's, etc. Since the OP is more than one year old you will probably need to message a moderator if you don't see the edit button at the bottom of the OP. After one year of no activity in a post, the post gets locked.


Hi TicToc,

Thanks... I'll try that... Or I'll just end the thread with the finished ToDo... It's about that time, I guess









KR
Dan


----------



## DanHansenDK

Hi,


To end this, I'll post the power differences between the two GPUs, GT640 & GTX750Ti (both low-profile cards)


Asus GT640:

Volts AC: 220
Amps: 0.96
Watts: 214.5

24 h power consumption: 5.2 kWh

GTX750ti*:

Volts AC: 220
Amps: 1.27
Watts: 281.8

24 h power consumption: 6.6 kWh

* 2x GTX750ti & 2x GTX750

Crunching data using these cards clearly shows the importance of choosing the right card! The GT640 crunches around half of what the GTX750Ti does. I'll have to look this up, I can't remember the exact numbers... but it's pretty close to double using the GTX750... And looking at the power consumption, it's unreal... 
I never thought this rig could run using a 300 W PSU! But look at it... 281.8 watts... It cuts the cost of the PSU in half 
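As a sanity check on the figures above (energy = power x time), a steady 214.5 W over 24 hours works out to about 5.1 kWh and 281.8 W to about 6.8 kWh; the small differences from the quoted 5.2/6.6 kWh suggest those were read off a meter over the day rather than derived from a single wattage reading:

```shell
#!/bin/sh
# Energy = power x time: convert a steady wattage into kWh per day.
# prints: 5.148 kWh
awk 'BEGIN { printf "%.3f kWh\n", 214.5 * 24 / 1000 }'
# prints: 6.763 kWh
awk 'BEGIN { printf "%.3f kWh\n", 281.8 * 24 / 1000 }'
```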

KR
Dan


----------



## DanHansenDK

DanHansenDK said:


> Hi,
> 
> 
> To end this, I'll post the power differences between the to GPU's GT640 & GTX750ti (Both Low Profile Cards)
> 4x GPU's running at full load.
> 
> 
> Asus GT640:
> 
> Volts AC: 220
> Amps: 0.96
> Watts: 214.5
> 
> 24 h power consumption: 5.2 kWh
> 
> GTX750ti*:
> 
> Volts AC: 220
> Amps: 1.27
> Watts: 281.8
> 
> 24 h power consumption: 6.6 kWh
> 
> * 2x GTX750ti & 2x GTX750
> 
> Crunching data using these cards clearly shows the importance of choosing the right card! The GT640 crunches around half of what the GTX750Ti does. I'll have to look this up, I can't remember the exact numbers... but it's pretty close to double using the GTX750... And looking at the power consumption, it's unreal...
> I never thought this rig could run using a 300 W PSU! But look at it... 281.8 watts... It cuts the cost of the PSU in half
> 
> KR
> Dan




Well, talking to myself here... Anyway, a nice little thing has appeared (sorry about the Danish description). It's not that much higher a core count, but 100+ more cores still makes a difference. Will it run on a 300 W PSU??? It's right on the edge, I guess 
Notice the watts... 75 watts, 768 CUDA cores!?!?!?
https://www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1050/

General
Device type	Graphics card - low profile
Bus type	PCI Express 3.0 x16
Graphics engine	NVIDIA GeForce GTX 1050 Ti
Core clock	1290 MHz
Boost clock	1392 MHz
Maximum resolution	7680 x 4320
Max. number of supported monitors	3
Interfaces	DVI-D (dual link)
DisplayPort
HDMI
API support	DirectX 12, OpenGL 4.5
Features	MSI Solid Capacitor, Dual Fan Design, NVIDIA Adaptive Vertical Sync, Military Class 4 Components, NVIDIA G-Sync ready, NVIDIA GameWorks, NVIDIA SHIELD ready, FinFET technology, HDCP
Memory
Size	4 GB
Technology	GDDR5 SDRAM
Effective clock speed	7.008 GHz
Bus width	128-bit
System requirements
Required power supply	300 W
Miscellaneous
Power consumption (operating)	75 W
Included software	MSI Afterburner
Compliance standards	MIL-STD-810G
Width	3.5 cm
Depth	18.2 cm
Height	6.9 cm
Weight	290 g


----------



## tictoc

Thanks for keeping this thread going. :thumb:


The 1050 will use a similar amount of power as the 750ti, and perform 40%+ better than the 750ti. I have a 1050ti without external power and it crunches great while sipping power.


----------



## DanHansenDK

Helloooo friends,


OK, I've invested in a wind turbine, a lot of hardware from over there (the States), a solar panel setup and a new card, a GTX1050Ti 4GB. I'm expanding the project a little. Work has been crazy and I've been totally occupied. I'm sorry, friends. 

OK, I'm building this sun-and-wind power setup. First a 4x 300 W solar panel config and one 2 kW 48 V AC 3-phase 40 A wind turbine with 2 battery banks. Battery bank 1 is 6x 100 Ah pretty special deep-cycle batteries, and battery bank 2 will be lithium-ion 18650 batteries harvested (mostly) from laptop batteries. We will start with 24 V banks, and then I will go from there to a more efficient 48 V setup. I've got myself another rack for this 48 V battery bank, but I will build that last. I've bought a 3000 W APC UPS in Belgium, which I will use as an inverter, and this will run only on the 48 V li-ion battery bank. I just got the charge controller last week, from the States as well: a MidNite Classic 200, which apparently is the best on the market for wind turbines and DIY maniacs like us. It's a pretty nice piece of hardware, that's for sure. 

I'll document it all. I'll do videos from start to finish. But I'll write in here first, as always. This is where it all started for me 

The first config will be a grid-tie setup with 24 V battery banks. A wind turbine needs a load at all times, and of course I've gotten myself some monster resistors (like those we use in elevators for braking), so that when the battery bank is full, the load is dumped. This dump load is directed out into these resistors, and the overproduction of electricity is burned off as heat. YES, I know! And of course I've got a plan here. This dump load will instead be used for running mining rigs, so that the electricity is put to good use!

I'm currently running the GTX 1050 Ti 4GB on a Windows system, testing it for mining performance and temperature. I'm rebuilding "Test System - Beaufort" to run with 4x GTX 750 Ti on Linux server as a headless rig, as usual. 

Actually, I'm not very satisfied with the performance of the new card Bitcoin-wise. It's several times faster than the GTX 750, but we are getting nowhere with the bitcoins that way. It's a dollar a day. And yes, free energy - so there are no expenses.


OK, if any of you knows how to install and run e.g. Gminer, T-Rex miner or CCminer on Linux Server CLI, I would be so happy for the help. I've spent a lot of time figuring all this out, and as you know I've succeeded in getting several GPUs running headless on Linux rigs, so the systems are ready for GPU mining. 
I know about wallets, I've got one of course. And one for testing. And I know about pool mining too. I've prepared a wallet for full-node mining, and the Linux rigs will be full nodes in the end. That's the goal.
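I can't give a full miner setup here, but as a sketch of the general shape of a headless CLI launch: the pool URL and wallet below are placeholders, `build_miner_cmd` is my own helper, and the `-a/-o/-u/-p` flags are the conventional ccminer ones (check them against your particular build):

```shell
#!/bin/sh
# Sketch: compose a ccminer command line for a headless rig.
# POOL and WALLET values are placeholders, not real endpoints.
build_miner_cmd() {
  algo=$1; pool=$2; wallet=$3
  echo "ccminer -a ${algo} -o ${pool} -u ${wallet} -p x"
}

build_miner_cmd lyra2v2 stratum+tcp://pool.example.com:4444 YOUR_WALLET_ADDRESS
```

Run the resulting command detached so it survives the SSH session, e.g. under `screen`, `tmux`, or `nohup ... &`, the same way the BOINC client runs headless on these rigs.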

The system I'm building here will begin as a grid-tie system. I will build an off-grid system too, to have several rigs running there, and then finally the grid-tied system will be rebuilt into a grid-tie AC-coupled system.

I'm hoping for some help with the mining stuff on Linux.

Here's the status of the GTX1050Ti running at 100%, mining with the T-Rex GPU Miner v0.9.2 on CUDA 10

BTW, we will need CUDA 8 on Linux when mining on this card. Maybe there's a new LP card coming soon?

Are we going to use only 1 larger card in the future??? What do you say, guys? I've bought some 90-degree adapters so such a card can be installed on our test systems. But I like the low-profile GPUs. They're more expensive, I know, but are they better or not? Shall we build a system to compare?


----------

