In my previous post I reported how to build and install TensorFlow and Horovod from sources and how to set up a BeerWulf (Beowulf) cluster. Building this BeerWulf cluster is a good exercise in making a (resilient) system out of commodity hardware; however, it is not the most efficient way for a practical purpose (in my case: creating an AI model that helps me pick stocks). In this post I consider hardware alternatives that are both as efficient and as cheap as possible.
What do we need for a powerful Deep Learning model?! Well, first of all, of course, Big Data, and then, in order to process it, a lot of RAM and CPU power (better yet GPU power, if you manage to harness it properly; we come to GPUs further in this post).
Probably the cheapest way to get RAM and CPU (but not GPU) is to buy a used server on eBay: new servers are f*cking expensive, but used ones can occasionally be bought cheaply, since good companies renew their hardware proactively. So I was lucky to buy two servers from serverzerogmbh.
Why two? Well, because one is not sufficient and three is too many. In other words, I tried to buy all hardware in pairs, because every piece of hardware has to be tuned, and the return on the tuning effort doubles if you have two identical parts. Furthermore, you can compare the two units with each other and quickly determine whether one of them is out of order. So I got two SuperMicro servers that are almost an out-of-the-box solution for Deep Learning.
Almost... First of all, they are very noisy, but that is not a big problem: thanks to an IKEA LACK table 14729 (whose secret purpose is to be a server rack) you can easily move your cluster to a kitchen or a bathroom. A more serious problem is that you cannot simply rely on the standard Linux installer (Ubuntu 20.04): the installation itself succeeds, but GRUB (the boot manager) fails to handle the RAID properly and the system does not boot. There are some solutions on StackOverflow, but I took an easier way: I just installed an extra SSD, which GRUB recognizes without a problem and which additionally accelerates the boot-up process. Interestingly, you can get this SSD cheaper on Amazon than on eBay.
Note that although there are enough free SATA connectors (besides the RAID controller), there are no free SATA power plugs: all you have are just two floppy power plugs! But no problem: you will find a proper cable on eBay (though it took me a while to find such an exotic adapter).
But there was another problem: one server booted up properly, but the other one beeped like crazy. The SuperMicro mainboards have a lot of beepers (PC speakers), and because of the fan noise I could not even tell at first where the beeping came from. So I disconnected all fans (NB! dangerous, since the system must be properly cooled) and finally localized the source of the beeping: it was the power supply unit. As a matter of fact, these servers have two power supply units, and a server can run on just one of them (like a robust aircraft can fly on just one engine). But of course running on just one of two engines is not a good state, so the system sounds an alarm.
So, having found the problem, I looked on eBay to see what a power supply costs. The cheapest current price was €40 (from an Israeli seller), so I claimed (and immediately obtained) a €40 refund from serverzerogmbh. Note that the power supply must match your server chassis exactly: just have a look at how different the PWS-702A-1R and the PWS-703P-1R are.
So for less than €500 I have two servers, each with 7.5 TB of HDD (RAID), 144 GB of RAM and two Xeon E5-2620 v2 CPUs (each with 6 cores = 12 threads). There is upgrade potential to high-end Xeons, which would allow me to almost double the CPU power. However, I am not going to do it for now (let us wait until the obsolete Intel v2 CPUs cost virtually nothing on eBay). An additional good thing is that I can put (almost) all the RAM in one server: 256 GB of RAM plus swap on an SSD should be sufficient to crunch high-frequency historical stock price data (a long-term item on my to-do list).
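As a quick sanity check of these figures, here is a minimal Python sketch that tallies the cluster. The constant and function names are purely illustrative, the price is only the upper bound ("less than €500"), and two threads per core assumes that Hyper-Threading is switched on.

```python
# Per-server figures as quoted in the post.
SERVERS = 2
CPUS_PER_SERVER = 2
CORES_PER_CPU = 6          # Xeon E5-2620 v2
THREADS_PER_CORE = 2       # assumption: Hyper-Threading enabled
RAM_GB_PER_SERVER = 144
TOTAL_PRICE_EUR = 500      # upper bound for both servers together


def cluster_totals():
    """Return total threads, total RAM and the price per thread."""
    threads = SERVERS * CPUS_PER_SERVER * CORES_PER_CPU * THREADS_PER_CORE
    ram_gb = SERVERS * RAM_GB_PER_SERVER
    return {
        "threads": threads,
        "ram_gb": ram_gb,
        "eur_per_thread": TOTAL_PRICE_EUR / threads,
    }


totals = cluster_totals()
print(f"{totals['threads']} threads, {totals['ram_gb']} GB RAM, "
      f"at most {totals['eur_per_thread']:.2f} EUR per thread")
```

Both servers together hold 288 GB, which is why moving (almost) all modules into one machine to reach 256 GB is feasible.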
The second (paired) unit I tried was based on the TERRA D3348-B13 GS1 Fujitsu 2011-3 DDR4 ATX mainboard, which I got with a proper CPU cooler for less than €100. A proper CPU cooler is really important: first of all, it alone can cost more than €20, and you will get into big trouble if you install an improper one. I hoped the CPUs and RAM would be interchangeable with the Supermicro servers described above, but ... Intel's CPU naming is rather counterintuitive: LGA2011 (the v2 Xeon line) and LGA2011-3 (the v3 and v4 Xeon lines) are quite different (and totally incompatible) things!
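To keep track of what fits where, the pitfall can be written down as a tiny compatibility table. This is only a sketch of my two platforms: the `Platform` class, the dictionary names and the `fits` helper are hypothetical, and the table covers only the parts mentioned in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    socket: str
    memory: str


# Xeon E5 generations used in this post and the platform they need.
XEON_E5_PLATFORMS = {
    "v2": Platform(socket="LGA2011", memory="DDR3"),
    "v3": Platform(socket="LGA2011-3", memory="DDR4"),
}

# The two kinds of machines described in this post.
BOARDS = {
    "SuperMicro server": Platform(socket="LGA2011", memory="DDR3"),
    "TERRA D3348-B13": Platform(socket="LGA2011-3", memory="DDR4"),
}


def fits(cpu_generation, board):
    """True if a Xeon E5 of this generation matches the board's socket and RAM type."""
    return XEON_E5_PLATFORMS[cpu_generation] == BOARDS[board]


for board in BOARDS:
    for generation in XEON_E5_PLATFORMS:
        verdict = "fits" if fits(generation, board) else "does NOT fit"
        print(f"Xeon E5 {generation} {verdict} in {board}")
```

Despite the almost identical socket names, a v2 CPU and its DDR3 modules cannot be moved into the TERRA board, and vice versa.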
This TERRA mainboard is designed for LGA2011-3: low-end Xeon v3 processors (6 cores = 12 threads) can be had for €20. So I first bought two Xeon E5-2620 v3 CPUs, checked that everything works and then upgraded to the high-end Xeon E5-2680 v3 (12 cores, 2.5 GHz to 3.3 GHz, LGA2011-3). Note that the old CPUs will not be thrown away: I am awaiting a dual-CPU mainboard from China, which I bought for just €125, and I will install these CPUs in it: 2*2*6=24 CPU threads will be enough.
Unlike the out-of-the-box SuperMicro servers, I had to build my TERRA Fujitsu system from the ground up. Here it is important to choose a proper tower.
In theory, any (E-)ATX tower will do; I used both a Makashi Enermax PC Gaming ATX Case with Tempered Glass and a no-name (but still very good) ATX tower, which I took from my old computer (an old AM2+ machine that served me for more than 10 years and has just recently passed away). The shortcoming of the Makashi is the lack of a DVD slot, but for a server that is not a big problem. Its advantages are much more numerous:
1. It is covered both on top and on the bottom with a metal mesh, which catches the dust and allows efficient ventilation.
2. In an E-ATX tower you generally have enough space for things like a Tesla GPU (and for cooling/ventilating them properly). The extra fan that comes with the Makashi tower is pretty quiet: the Makashi is much quieter than the old ATX tower even though it has one fan more.
The only irritating problem: I hoped that I would be able to install two Tesla GPUs (in both available PCIe x16 slots) and an additional PCIe x1 GT 730 card for video output. The GT 730 has CUDA compute capability 3.5, just like the Tesla K40 does. So it would be a perfect match, but alas... BTW, a single-slot PCIe x1 card would do.
Ah yeah, almost forgot: E-ATX is too big even for my supertrunc, whereas ATX fits.
As to the power supply, I just looked for a powerful and cheap one, like the 1200W Active Gaming PC ATX 12V 2.3 PFC. Although a bit noisy (as it probably has to be at 1200 W), it is good for its price. I wish it had two CPU 4+4 connectors (again, for 1200 W that would be adequate). What for? Well, the Tesla K40 can still be supplied with 8-pin + 6-pin PCIe plugs, but the Tesla K80 does require a CPU 8-pin connector! OK, I am waiting for the delivery of a 2*8-pin PCIe to 8-pin CPU cable.
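A rough power budget shows why 1200 W is plenty and why 8-pin + 6-pin is enough for a K40. The sketch below relies on assumptions that are not from this post: the usual PCIe limits (75 W from the slot, 75 W per 6-pin plug, 150 W per 8-pin plug), vendor TDP figures of 235 W for the Tesla K40 and 120 W for the Xeon E5-2680 v3, and a guessed 100 W for mainboard, RAM, drives and fans. The function names are illustrative. The K80 is left out, since it is fed through a CPU-style 8-pin connector rather than PCIe plugs.

```python
# Assumed PCIe power limits in watts (not from the post).
PCIE_SLOT_W = 75
CONNECTOR_W = {"6pin": 75, "8pin": 150}

# Assumed vendor TDP figures in watts (not from the post).
TESLA_K40_W = 235
XEON_E5_2680_V3_W = 120
REST_OF_SYSTEM_W = 100     # pure guess: mainboard, RAM, drives, fans


def available_power(connectors):
    """Power a card can draw from the slot plus its auxiliary PCIe plugs."""
    return PCIE_SLOT_W + sum(CONNECTOR_W[c] for c in connectors)


def psu_headroom(psu_w, loads_w):
    """Watts left over after subtracting all loads from the PSU rating."""
    return psu_w - sum(loads_w)


k40_supply = available_power(["8pin", "6pin"])
print(f"K40 can draw up to {k40_supply} W, needs about {TESLA_K40_W} W")

loads = [XEON_E5_2680_V3_W, TESLA_K40_W, TESLA_K40_W, REST_OF_SYSTEM_W]
print(f"Headroom with two K40s on a 1200 W PSU: {psu_headroom(1200, loads)} W")
```

Under these assumptions the limiting factor is not the wattage but the number and type of connectors, which is exactly what I ran into.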
Last but not least (and actually the most expensive) is the RAM. This TERRA mainboard is designed for DDR4 RAM (so I cannot just take some DDR3 memory from my SuperMicro servers). Very important: with a Xeon, only special ECC server RAM will work (with a Core i CPU, conventional non-ECC RAM should do, but I have not tested it). However, on the secondary market ECC RAM (which does not fit into a commodity PC) is cheaper than non-ECC RAM! So I first got 64 GB for €200, then another 64 GB for €155 and then once more 64 GB for €135.
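To see how the price developed from purchase to purchase, here is a small Python sketch computing the cost per gigabyte. The figures are the three purchases above; the variable and function names are illustrative only.

```python
# (capacity in GB, price in EUR) for the three purchases, in order.
PURCHASES = [(64, 200), (64, 155), (64, 135)]


def eur_per_gb(capacity_gb, price_eur):
    """Price per gigabyte of a single purchase."""
    return price_eur / capacity_gb


total_gb = sum(gb for gb, _ in PURCHASES)
total_eur = sum(eur for _, eur in PURCHASES)
average = total_eur / total_gb

for gb, eur in PURCHASES:
    print(f"{gb} GB for {eur} EUR: {eur_per_gb(gb, eur):.2f} EUR/GB")
print(f"Total: {total_gb} GB for {total_eur} EUR, {average:.2f} EUR/GB on average")
```

The last purchase was roughly a third cheaper per gigabyte than the first one, so patience on the secondary market pays off.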
Remarkably, it was very hard to find the info on approved memory. Finally Google helped but
So in total each server cost me about €700. This is more expensive, but if we take into account the energy efficiency, the lower noise and the more modern CPU / RAM (a Xeon v3 with DDR4 is about 30% faster than a Xeon v2 with DDR3), it is likely the better option.
And just look at the current prices on the primary market (once again, salve CSL)
Well, these offers include a modern video card, which might be critical for gamers and miners but not necessarily for me. Apropos GPUs: instead of buying "top-notch" modern GPUs, I'd rather buy good old and cheap Tesla cards on the secondary market.
So far I have two Tesla K40s (one for €180 and one for €130) and two Tesla K80s (€200 each). About the latter I cannot say anything yet, since they were delivered with a non-standard power adapter, so I have to wait until the proper cables are delivered. As to the Tesla K40, I am pretty disappointed: (on my models) it is not really much faster than a high-end Xeon ... and I still cannot fix the desynchronization problem!
I still hope I will be able to harness the GPU power later, but so far, not so good.
Last but not least, I would like to warn everyone: take care and do look at the seller's feedback ratings (Bewertungen)!
I do not even mean outright crooks like this one; PayPal buyer protection (Käuferschutz) should refund my payment (update 2021.08.16: yes, they did)!
There are also the smarter asses, like this one, who sells used goods as new (and, additionally, delivers an HD monitor instead of a QHD one) and then refuses to cover the return shipping costs.
Are eBay's fraud managers unable to understand that 2% negative feedback is indeed VERY much?! (Compare it to the 2.1% worldwide COVID mortality rate.) If such a guy is one of the best, then he is definitely one of the best of the worst. But eBay does not care...
And as the apotheosis: I once had the "luck" to win an auction for a Tesla K80 for €100. What did the seller do?! He cancelled the deal, which is a direct violation of eBay rules! I contacted him and demanded that he send the card... he whined that he had accidentally burnt out the hardware ... and a few days later I saw him offering three(!) Tesla K80s on eBay. I reported the case to eBay ... not only did eBay do nothing, it even promoted this guy's offers in my recommendation list!
Well, despite all this, buying on eBay has saved me quite a lot of money, but if eBay keeps neglecting seller quality assurance in such a way, then ...
P.S. You may be thinking something like: yes, the author is likely a technical nerd, but does this really relate to letting your money grow?!
Yes, it does! The goal is to further improve my neural network for stock picking, which promises an annual return of about 12%. Even if I improve it by "just" 0.5%, the hardware investment will pay off in the long run!
Don't believe it? Check it with our savings plan simulator!