What happened to nightly builds?


#21

The KiCad Windows nightlies http://downloads.kicad-pcb.org/windows/nightly/ are 740MB files served from the downloads server of kicad-pcb.org, which traceroute indicates is in France. The bandwidth for serving these is a separate matter from the failure of a developer’s Windows machine https://lists.launchpad.net/kicad-developers/msg31003.html .

I tried downloading the 2017-09-19 nightly several times with Firefox. Perhaps because I am in Australia, it was taking a long time and the process eventually stopped. I guess Firefox had some timeout. This happened several times, so I used wget from my virtual server at Hetzner in Germany. This proceeded quickly and I then used WinSCP to get it home from there.

If anyone is having difficulty downloading this nightly, you may be able to get it from this copy more easily:

http://www.firstpr.com.au/temp/kicad/kicad-r8510.dddaa7e69-x86_64.exe

The 2017-09-18 one is

http://www.firstpr.com.au/temp/kicad/kicad-r8504.3b7fbad1b-x86_64.exe


#22

I never had problems with downloading Windows Nightlies to Malaysia.
Is Australia having network cable or routing problems?
The Ubuntu PPA for the libraries is another story; that takes a long time.


#23

Current status: new box is ordered (took a while to hammer out something that the supplier is willing to support for five years), and space in the colo facility has been reserved as well. We’re waiting for the SSDs for the VMs still though.

Timeline:

  • October 27th: SSDs should arrive
  • October 28th–29th: Burn-in testing
  • October 30th–31st: Shipping
  • November 1st: Setup

#24

Are you guy(s) funded via the CERN donation or is that separate?


#25

No, that is separate. CERN uses that money to commission new features. Using donations for recurring payments is a logistical nightmare unless it’s something small like a domain name.

The build server needs a bit of horsepower and connectivity, which puts renting a dedicated box out of the sensible range for donations anyway. We have a few offers for machines, but none of these work well because the build process needs six hours of CPU time, downloads about 1 GB, writes 50 GB to disk and uses 700 MB of RAM per thread.

A lot of that could be solved with better software (e.g. caching of results), but all we have right now is Windows+MSYS2+Java+Jenkins.


#26

Just posted on the developers’ list by Simon Richter, the developer volunteering his time for the Windows builds:

New server is ordered, but the SSDs seem to be difficult to get right
now. The hardware will probably be ready by next Friday, then the
selftests will run over the weekend, then shipping to the colo facility,
so if all works out, November 1st will be setup day.


#27

LOL. Check the profile of the poster above your post…


#28

[quote=“GyrosGeier, post:25, topic:8167”]…but none of these work well because the build process needs six hours of CPU time, downloads about 1 GB, writes 50 GB to disk and uses 700 MB of RAM per thread.
[/quote]

Welcome to the USER FORUMS! We, well most of us, really REALLY, REALLY appreciate this mostly awesome Open Source software!

However, there will be grumblings when things don’t work as expected.

How much does this new machine cost?


#29

All in all, about 10k€, but KiCad nightlies only need a tiny slice of that machine since that is a once-a-day job that is done in about twenty minutes, mostly thanks to the RAID controller adding a more aggressive cache layer than could be implemented in software.

The rest of the time, it will mainly run database and computation workloads for my company, so there is a reason it is overpowered. :slight_smile:


#30
  • What about distributing nightlies through Torrent? Works well and probably lots of us would contribute.
  • Separate libs. I often install nightly and always untick all the options of installer.

#31

Again, the distribution is not the problem here. It is building the software. (Converting source code into something a computer understands.)


#32

What exactly so challenging about compiling code? Thanks, by the way, for explaining to us in laymen’s terms about what this mysterious process is all about. While you’re at it, can you also explain to me what exactly RAID controller and a high bandwidth link does?


#33

This is not easy to explain in such a way that someone without knowledge of what is going on can easily understand it. (And I need to confess that I am not as knowledgeable about compilers as I would like. I fear I forgot a lot about this topic already.)

Summary from an answer on stack exchange

  • Many files hold information that needs to be interpreted by the compiler. (Lots of disk read access operations -> very slow)
  • Parsing C++ syntax into the internal data structure used by the compiler takes some time.
  • Optimization: There are a lot of operations that can be improved by the compiler such that the performance of the resulting program is increased. Optimizing is a hard problem. (Needs lots of computation power)
  • Creating the assembler code (machine code) takes some time as well.

And as far as I can tell, this server would be a bit overpowered if it were only used for compiling KiCad. (The owners donate a bit of computing time to compiling KiCad.)

A RAID controller manages systems where you have more than one hard disk. There are different options for how these can be managed. All have their own benefits and drawbacks. (Below is a simplified summary.)

  • One option is to split your data across all disks. (Fast, but if one disk fails you lose all your data.)
  • Another option is to have all data mirrored on multiple disks. (Low chance of losing data.)
  • And there are other options where you split the data but also include recovery information in case a disk breaks. (Depending on the RAID level, 1 or 2 disks are allowed to fail before data recovery is not possible anymore.)

I would guess this has to do with the fact that this server is normally used for serving a database. (But this is something where I really know next to nothing.)
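To make the "recovery information" option a bit more concrete, here is a minimal Python sketch of XOR parity, the basic idea behind RAID 5-style recovery. The function names and the three-"disk" layout are made up for this example; a real controller works on fixed-size blocks and spreads the parity across the disks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_parity(data_blocks):
    """Compute the parity block that is stored alongside the data blocks."""
    return xor_blocks(data_blocks)

def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

# Three hypothetical "disks", each holding one 4-byte block.
disks = [b"KiCa", b"d ni", b"ghtl"]
parity = make_parity(disks)

# Pretend disk 1 has failed and rebuild its content from the rest.
recovered = recover_missing([disks[0], disks[2]], parity)
```

Because XOR is its own inverse, XOR-ing the surviving blocks with the parity gives back the lost block. With only one parity block, a second failed disk cannot be recovered.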

The term bandwidth has its roots in signal theory. (It describes how wide a frequency band a signal occupies.)
In computing it is more or less used to describe network speeds. (How much data can be transferred per unit of time.)

So from that I would guess a high bandwidth link is “just” very fast network access hardware.


#35

Compiling a source file is easy.

Building a large application suite, with all of its associated dependencies, and making sure these builds and the subsequent packaging do not fail, is somewhat less easy. Usually the build servers fetch the sources for all of the dependencies and build them, and then the application sources are updated from the head of the repository. It’s all automated, with error reporting and all the things that are necessary, but it does require attention. It’s not as simple as ./configure; make; make install, much as they’d like it to be.

Of course, you are welcome to configure a build server and show us how much smarter you are than those lousy developers.


#36

Official note

…in case anyone is wondering why the thread reads a bit differently.
I edited some posts in this thread to remove the offensive tone of a single user.
Thanks to everybody else involved for staying polite and civil.


#37

That’s a deep and complex question, but a full answer is probably too long for this forum :slight_smile:

However, your question does contain a clue. You missed out the word “is”, a fairly important verb with respect to parsing the meaning. As a human, I can immediately see the error, mentally correct it and carry on. However, computers are stupid, extremely stupid (but fast). They would likely just say “Syntax error”. You then have to figure out what the error was and how to fix it, before the computer can even attempt to answer the question.

Complex software also has a set of instructions telling the computer how to compile the software. The build instructions are also a kind of programming language, so you have the same problem of telling the computer exactly what to do, and of the computer failing to understand when you get the smallest thing wrong.

Imagine talking to a person who whenever you made a typo, a punctuation or grammar mistake etc, just said “Error”.

tl;dr: computers are stupid.


#39

So what exactly IS the problem? I never said that there is no human in the loop. The original statement that I was puzzled by was “…distribution is not a problem but building it is.” What exactly is that mysterious “problem” in building the software that is related to the current predicament? So far all I’ve heard was that it is difficult because it can’t be fully automated. “Difficult” is not a problem.


#40

You have to ask the people who run this. They don’t usually frequent these support forums, but hang around in their own world: the developers’ mailing list.
I’m amazed @GyrosGeier showed up to be honest.


#41

You probably missed that the ‘problem’ is having the builds hosted online for free…
This is available as ‘Jenkins’ for Linux or OSX, but NOT for Windows… that is the problem here…
Maurice


#42

We use Jenkins (on a Linux box) to remote-control a Windows VM that does the actual build.

This works reasonably well, but requires that the connection between Jenkins and the agent remains up for the entire duration of the build, otherwise Jenkins will mark the build as failed and the agent will kill everything. File transfers between Jenkins and the agent go through an RPC protocol over the same channel.

Jenkins isn’t built for this kind of setup, really — they expect all machines to be on a LAN, with low latency (copying one file requires a full round trip, and the next file is not started before the last one is acknowledged) and high throughput (otherwise, large file transfers clog up the RPC pipe, causing the periodic “ping” requests over the same connection to time out).
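As a back-of-envelope illustration of why both latency and throughput matter here, the sketch below models a transfer where every file costs one full round trip plus its size divided by the link speed. The file count, round-trip times and link speeds are assumptions picked for the example, not measurements from the actual build setup.

```python
def transfer_seconds(file_sizes_bytes, rtt_seconds, throughput_bytes_per_s):
    """Estimate the time to copy files one at a time, each waiting for an acknowledgement."""
    latency_cost = len(file_sizes_bytes) * rtt_seconds               # one round trip per file
    payload_cost = sum(file_sizes_bytes) / throughput_bytes_per_s    # time to move the raw bytes
    return latency_cost + payload_cost

# Hypothetical workload: 20,000 small files of 10 kB each (200 MB in total).
files = [10_000] * 20_000

# Assumed links: a LAN (0.5 ms round trip, 100 MB/s) and a DSL uplink (40 ms, 1 MB/s).
lan = transfer_seconds(files, rtt_seconds=0.0005, throughput_bytes_per_s=100e6)
dsl = transfer_seconds(files, rtt_seconds=0.040, throughput_bytes_per_s=1e6)

print(f"LAN: {lan:.0f} s, DSL: {dsl:.0f} s")
```

Under these assumptions the same transfer takes roughly 12 seconds on the LAN and roughly 1000 seconds over DSL, and most of the DSL time is spent waiting for acknowledgements rather than moving data.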

So, the box needs a really good connection. I’ve had the builds running on a VM on my home DSL for some time, where about 50% of builds would succeed, and every attempt would take six hours. CPU horsepower isn’t that important; even an i3 can do it in about 2.5 hours wall clock time, but rushing through in ten minutes will reduce the chance of failures even more.

Also, the build process could be a bit more efficient. Right now, we always download the latest state of the libraries, unpack them, copy them to the installation directory, repack them, copy the archive into the installer, and then sign the installer, which requires another copy. That is where most of the build time comes from on the small boxes, they simply don’t have the I/O bandwidth for that.

This is where the RAID card comes in handy — it has a few GB of RAM that has its own battery, so it will just accept the write of the footprint archive in one big transaction, report that the data has been written, and then send it to the disks in the background, while the build process does the next step, which conveniently uses the same data that is still in RAM, dropping the wall clock time for unpacking the archive from a few minutes to a few seconds.
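The effect can be pictured with a toy write-back cache in Python. This is only a model of the idea: the class name and the dict standing in for the disks are invented for the example, and a real controller works at block level in battery-backed RAM.

```python
class WriteBackCache:
    """Toy model: writes are acknowledged from RAM and flushed to disk later."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # dict standing in for the slow disks
        self.dirty = {}                     # data accepted but not yet on disk

    def write(self, key, data):
        # Returns immediately; the caller sees the write as done.
        self.dirty[key] = data

    def read(self, key):
        # Recently written data is served straight from RAM.
        if key in self.dirty:
            return self.dirty[key]
        return self.backing_store[key]

    def flush(self):
        # In the background, pending writes are pushed to the disks.
        self.backing_store.update(self.dirty)
        self.dirty.clear()

disks = {}
cache = WriteBackCache(disks)
cache.write("footprints.zip", b"archive bytes")   # hypothetical file name
served = cache.read("footprints.zip")             # next build step reads from RAM
on_disk_before_flush = "footprints.zip" in disks
cache.flush()
on_disk_after_flush = "footprints.zip" in disks
```

The next build step gets its data back before anything has reached the disks, which is where the time saving comes from. The battery is what makes it safe to report the write as done early.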

I have an item on my TODO list to make the build more efficient and also allow separation of binaries and library, but this will take some time to get right — there are a few frameworks for that already, but none of them fit exactly (like Jenkins, which makes 90% of the job easy, and the remaining 10% would require a full rewrite of Jenkins from the ground up).
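As one illustration of what a more efficient library step could look like, the Python sketch below skips the unpack and repack work when the downloaded archive is byte-for-byte the same as last time. It is only a sketch under assumptions: the helper names, the stamp file and the archive name are hypothetical and not part of the current build scripts.

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def needs_repack(archive, stamp):
    """True if the archive differs from the digest recorded after the last repack."""
    stamp = Path(stamp)
    return not stamp.exists() or stamp.read_text().strip() != file_digest(archive)

def mark_repacked(archive, stamp):
    """Record the archive digest once the repack step has succeeded."""
    Path(stamp).write_text(file_digest(archive))

# Small demonstration in a temporary directory (file names are made up).
with tempfile.TemporaryDirectory() as workdir:
    archive = Path(workdir) / "libraries.zip"
    stamp = Path(workdir) / "libraries.sha256"

    archive.write_bytes(b"footprints v1")
    first = needs_repack(archive, stamp)    # no stamp yet, so work is needed
    mark_repacked(archive, stamp)
    second = needs_repack(archive, stamp)   # unchanged archive, step can be skipped

    archive.write_bytes(b"footprints v2")
    third = needs_repack(archive, stamp)    # content changed, repack again
```

The stamp is written only after the repack has succeeded, so a failed build does not cause the next run to skip the step by mistake.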