New project file format

Ok, now it’s easier for me to give some opinions.

First, you said “tons of atomized configuration files”. I accept hyperbole, but there’s still only a very limited number of configuration files, and I think you already named those which belong to a project. So…

  1. Performance. Opening 1 file instead of 100 is much faster.

Do you really have benchmarking data which shows that on modern hardware and a modern OS there would be a noticeable difference between opening 100 configuration files instead of 1, compared with the time spent on actually parsing those files? Let alone fewer than a dozen, which is more realistic for a KiCad project configuration? I don’t think this argument stands.
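For anyone who wants actual numbers rather than opinions, a rough way to measure this is sketched below. It is only a sketch under stated assumptions: the file names, the JSON payload and the helper names are all made up for illustration and have nothing to do with KiCad's real files, and the timings come from a warm OS cache, so they say nothing about antivirus scanning, network shares or cold disks.

```python
import json
import tempfile
import time
from pathlib import Path


def write_configs(root: Path, n_files: int, keys_per_file: int = 50) -> None:
    """Write n_files small JSON files plus one merged file with the same data."""
    merged = {}
    for i in range(n_files):
        section = {f"key{k}": k for k in range(keys_per_file)}
        (root / f"part{i}.json").write_text(json.dumps(section))
        merged[f"part{i}"] = section
    (root / "merged.json").write_text(json.dumps(merged))


def load_split(root: Path, n_files: int) -> dict:
    """Open, read and parse every small file (n_files open-read-close cycles)."""
    return {
        f"part{i}": json.loads((root / f"part{i}.json").read_text())
        for i in range(n_files)
    }


def load_merged(root: Path) -> dict:
    """Open, read and parse the single merged file."""
    return json.loads((root / "merged.json").read_text())


def best_time(fn, repeats: int = 20) -> float:
    """Return the fastest wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare(n_files: int) -> tuple:
    """Return (split_seconds, merged_seconds) for n_files hypothetical configs."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_configs(root, n_files)
        split = best_time(lambda: load_split(root, n_files))
        merged = best_time(lambda: load_merged(root))
    return split, merged
```

Calling `compare(100)` and `compare(12)` on the machine in question gives the two figures to put side by side; whatever the result, it only describes that machine and that file system.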

  1. Disk space. Each small file takes a disk space unit. Having tons of small files wastes disk space.

This argument is even weaker. We are talking about a few files, not thousands. And we are talking about systems with Gbytes of drive space, not about embedded systems. How many bytes does the FS metadata take? Any benchmarking?
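A back-of-the-envelope estimate of the wasted space can be sketched like this. The 4096-byte allocation unit and the 1 KiB file size are assumptions chosen for illustration, and the model is simplified: it ignores per-file metadata and the fact that some file systems store very small files inline.

```python
def allocated_bytes(size: int, cluster: int = 4096) -> int:
    """Bytes occupied when space is handed out in whole clusters (simplified model)."""
    # Ceiling division with integers: round size up to a multiple of cluster.
    return -(-size // cluster) * cluster


def slack_bytes(sizes, cluster: int = 4096) -> int:
    """Total allocated-but-unused bytes for a list of file sizes."""
    return sum(allocated_bytes(s, cluster) - s for s in sizes)


# A dozen hypothetical 1 KiB config files on 4 KiB clusters.
wasted = slack_bytes([1024] * 12)
print(f"{wasted} bytes of slack")  # 12 * (4096 - 1024) = 36864 bytes
```

Under these assumptions a dozen small files cost a few tens of kilobytes, which is the scale the reply above is arguing about.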

  1. Maintainability of the code. Having tons of functions to deal with tons of files is much more complex than having just one.

Again, I’m not against rhetorical exaggeration, but we are talking about a few files and how many, two? file formats. What are those “tons of functions”? The number of files has nothing to do with how many functions are needed to read them. The only drawback is that the files must be named in the code. But it’s far from “tons of” anything.

  1. I’m tired of explaining to new users at work which files must be included in a repo and which ones must not. More files, more problems.

I would say this is more a problem in the KiCad documentation, but also in your company’s workflow. For any reasonably good version control system you would have one centralized ignore file which can be copied or is used automatically. There shouldn’t be a need to explain it again and again. And here’s a new idea: add ignore files to KiCad itself which can be automatically added (as an option) for a new project.

Here, again, the problem is something other than the number of configuration files.

  1. Simplicity, the KISS principle.

“Make it as simple as necessary, but not more simple”. There are reasons for dividing the files. Complexity is added only if necessary. If there’s too much complexity because of, say, historical reasons, it’s possible to reduce it. “The KISS principle” is easy to throw in, but unfortunately reality is mostly more complex, and the principle is not an argument per se. You should argue why a certain implementation is too complex considering the needs and benefits.

  1. When you enter a directory and find tons of files you lose time.

That would be fixed by a better folder hierarchy, not by reducing configuration files. I would like to see subdirectories in a default new project, at least one for “configuration”.


Lots of small files as opposed to one larger file does come with a performance hit, since there are multiple open-read-close calls.
This is further compounded if there is disk encryption, and compounded again if these files are accessed across the network, since Windows discovery is poor.

That said, it would be negligible for a couple of files and more of a serious issue if you are dealing with something like MATLAB (millions of 1k files).

From reading this discussion I get the feeling that the files are not the real problem. They might just be the visible symptom of what @set really wants to be changed. I kind of get the feeling they would prefer defining a single search path and then letting KiCad find the libs within it, similar to how Eagle works.


This is the main reason why I think that.

If I had to guess, @set is involved somewhere with a high turnover of people committing stuff. The comment about needing to explain this very often kind of backs this up as well. This would indeed make it harder to have a centralized setup and to expect new members to invest time in system setup.

(In my current job I needed basically a month to have everything up and running. Setting up KiCad is nothing compared to setting up a full toolchain for embedded development and getting access to all required SVN, Git, network drives and Jira instances plus all licence keys.)

The solution for the problem described might, however, be a lot simpler than getting rid of local library tables. If KiCad includes all its library assets inside its files (as will be done with version 6), then there is no need to have valid libs for a review. Until then the archive project plugin by @MitjaN should solve these issues nicely. See Project and libary setup for sharing and collaboration


On Windows, opening lots of small files is hit by antivirus scanning and, on conventional disk drives, by disk directory fragmentation.


Kind of the whole point of s-exp is that they are so simple to parse that almost no support is needed to do it. They come from a time when computers were just barely out of the electromechanical age of computing and processing time was at a premium, so you end up with a simple, elegant solution.

I dislike this idea of a proliferation of formats… it would be far better if all native KiCad formats were s-exp. If people can’t read the dang filename, and the text in the file that says what it is… why are you trying to fix a problem that doesn’t exist by causing more problems???

Now you cannot parse all native formats with a single parser… I’d really prefer to only need to know one format when dealing with KiCad, and I’d prefer that format to be near plain text, which JSON is not quite. YAML is nicer in that respect, perhaps nicer than s-exp. The only real problem is that parsing YAML or JSON is harder than parsing s-exp, which is trivial.
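To give an idea of what “trivial” means here, below is a minimal sketch of an s-expression reader. It is not KiCad's parser and the sample input is a made-up fragment, not a real KiCad file; the function name and the token pattern are illustration only. It handles parentheses, bare atoms and double-quoted strings, and leaves escape sequences inside strings untouched.

```python
import re

# One token: "(" or ")" or a double-quoted string or a bare atom.
TOKEN = re.compile(r'\s*(?:(\()|(\))|"((?:[^"\\]|\\.)*)"|([^\s()"]+))')


def parse_sexp(text: str) -> list:
    """Parse s-expressions into nested lists of strings (top-level forms in a list)."""
    text = text.rstrip()
    stack = [[]]  # stack[0] collects the top-level forms
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        pos = match.end()
        if match.group(1):            # "(" opens a new list
            stack.append([])
        elif match.group(2):          # ")" closes the current list
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        elif match.group(3) is not None:  # quoted string (escapes kept as-is)
            stack[-1].append(match.group(3))
        else:                         # bare atom
            stack[-1].append(match.group(4))
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


# Hypothetical input, loosely in the style of an s-exp config file.
tree = parse_sexp('(project (name "demo board") (layers 4))')
print(tree)  # [['project', ['name', 'demo board'], ['layers', '4']]]
```

Everything comes back as strings; turning `'4'` into a number, and deciding what each list means, is the part a real application still has to write regardless of the format.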

I’m just pointing out that changes for the sake of change are not beneficial. Take a look at Linux… eternal code churn, so much so that developers can’t even deal with it and tap out. Take a look at any of the BSDs: the code just doesn’t change that often, and when it does, it is done consistently and for good reasons.

I don’t think I have ever seen that happen in practice with any software. Perhaps opening them read/write might trigger that, but there is no reason that opening many files read-only would.

Libraries are for this: to help you do a better job without needing to reimplement it. YAML has at least 4 implementations for C/C++. No problem here.

BTW: my question was why JSON instead of YAML. If the project had used s-exp for everything, and avoided creating two new config files for each release, I wouldn’t have made any comment. It is not s-exp (which I don’t like) that is the problem, but the endless proliferation of files and file formats.


I 100% agree.

more text here to comply.

KiCad was a particularly bad example. Opening Pcbnew for the first time used to take minutes for me on an old PC. I found that it was the antivirus and disk directory (not file) fragmentation. I was installing a new version of KiCad every few days. I used to have to compact the directories periodically. The Sysinternals tools showed what was going on.
The problem largely went away with an SSD PC and a different antivirus.

Sounds like a bad AV… rather than a problem worth working around.

Windows has a background file defragmentation feature these days. So far as I know, it does not defragment directories, so KiCad’s very complex footprint and 3D model directories can get scattered all over the drive. Fortunately SSDs make this irrelevant and also discourage file defragmentation, as it causes extra disk wear.

Fragmentation is irrelevant to wear on HDDs; it just slows them down. And it is completely irrelevant on SSDs… so yeah, again, never a problem other than for performance, and if it wasn’t for your AV being lame it wouldn’t have been an issue then. I’ve run KiCad for years on SSDs and HDDs (even Windows XP machines on older releases) with no issues.

You should look at using YAML instead of JSON. I have used JSON for several projects now for configuration, and the problem with JSON is that it doesn’t allow comments inside the structure. YAML is much more forgiving with comments, which can be very useful in data files such as config files and project files.
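The comment limitation is easy to demonstrate with Python's standard `json` module. The key name and the `//` comment style below are made up for illustration, and the stripping helper is a deliberately naive workaround (whole-line comments only), not something any particular project is known to use.

```python
import json

# Hypothetical config text with a comment line, which strict JSON forbids.
COMMENTED = """{
    // number of copper layers (illustrative key name)
    "layers": 4
}"""


def try_parse(text: str):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def strip_line_comments(text: str) -> str:
    """Drop lines that start with //. Naive: trailing comments are not handled."""
    kept = [line for line in text.splitlines() if not line.lstrip().startswith("//")]
    return "\n".join(kept)


print(try_parse(COMMENTED))                       # None: the comment is rejected
print(try_parse(strip_line_comments(COMMENTED)))  # {'layers': 4}
```

A preprocessing step like this works, but the comments are lost as soon as the program writes the file back, which is part of why a format with native comments is attractive for hand-edited files.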