Fate of the sweet symbol library format

greghuangqi · October 15, 2021, 2:52am

Hi,
Does anyone know what the plan is for putting 1 symbol in 1 file (sweet symbol library)? It looks like V6 still uses one file to hold a symbol library.
This is a relevant link on the topic:
https://www.mail-archive.com/kicad-developers@lists.launchpad.net/msg33692.html
We cannot have just .pretty but no .sweet.
TIA

eelik · October 15, 2021, 5:52am

It was kept as it is now for practical reasons, because of derived symbols. Otherwise it would have been more complicated. It’s possible that it will be changed in some later major version, but I doubt it.

JeffYoung · October 18, 2021, 11:12pm

We also have much more serious performance issues with footprints because large directories are very slow on Windows…

marekr · October 19, 2021, 1:29am

Que

hermit · October 19, 2021, 2:38am

Interesting. Is this a file system thing?

Seth_h · October 19, 2021, 3:23am

TLDR; Linux keeps a top-level cache so that many things (like stat queries) can proceed without actually touching each file. Windows does not. NTFS also has all sorts of layers for encryption/compression/etc. that can add time. There are lots of other reasons, but the upshot is that our approach in pretty files of having hundreds of small files is just about the worst thing you can ask Windows to do.

Here’s some details from the horse’s mouth for those who are curious: https://github.com/Microsoft/WSL/issues/873#issuecomment-425272829
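For anyone who wants to measure this on their own machine, here is a minimal sketch that creates a directory of small files and times one `os.stat` call per file. It is only an illustration: the file names and placeholder contents are made up, the numbers depend heavily on OS, disk and anti-virus, and it says nothing about how KiCad itself reads libraries.

```python
import os
import tempfile
import time

def make_small_files(directory, count):
    """Create `count` tiny placeholder files (stand-ins for footprint files)."""
    for i in range(count):
        path = os.path.join(directory, f"fp_{i:04d}.kicad_mod")
        with open(path, "w", encoding="utf-8") as f:
            f.write("(footprint placeholder)\n")

def time_stat_calls(directory):
    """Return (number of files, seconds spent calling os.stat on each one)."""
    names = os.listdir(directory)
    start = time.perf_counter()
    for name in names:
        os.stat(os.path.join(directory, name))
    return len(names), time.perf_counter() - start

with tempfile.TemporaryDirectory() as tmp:
    make_small_files(tmp, 500)
    n, elapsed = time_stat_calls(tmp)
    print(f"{n} stat calls took {elapsed * 1000:.1f} ms")
```

Running the same script on Linux and on Windows against the same kind of disk gives a rough feel for the per-file overhead being discussed.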

davidsrsb · October 19, 2021, 4:22am

This has come up several times in this forum: the first display of a PCB takes a long time on a Windows PC with a traditional hard disk. Disk access times, the operating system and anti-virus software can push loading times into minutes. Windows tends to defragment files, but not directories.
SSDs help, but cannot eliminate the delay.

greghuangqi · October 19, 2021, 6:21am

How about using a zip file or some other cache system to contain all these small files to speed up the reading? Just like what has been done for all the icon files? Anyway, keeping one symbol in one file is the way to go with git.
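A rough sketch of the zip idea, using Python's standard `zipfile` module: many small files go into one archive, and a single member is read back by name. The function names, member names and contents are placeholders invented for the illustration, not real KiCad data or KiCad code.

```python
import io
import zipfile

def pack_library(footprints):
    """Pack {name: text} into an in-memory zip archive and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, text in footprints.items():
            zf.writestr(name, text)
    return buf.getvalue()

def read_one(archive_bytes, name):
    """Read a single member; the zip central directory locates it by name."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(name).decode("utf-8")

# Placeholder contents, not real footprint data.
library = {
    "R_0603.kicad_mod": "(footprint R_0603)\n",
    "C_0805.kicad_mod": "(footprint C_0805)\n",
}
archive = pack_library(library)
print(read_one(archive, "C_0805.kicad_mod"))
```

The operating system then sees one file open instead of hundreds, which is the point of the suggestion.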

JeffYoung · October 19, 2021, 10:15am

Oooh… that’s an interesting idea…

(But wouldn’t git also treat the zip as one file? Or are there ways to make git zip-cognizant?)

jp-charras · October 19, 2021, 10:48am

One symbol in one file is not possible: derived symbols (AKA aliases) must be in the root symbol file, because they share most of the root data, so “One symbol in one file” makes no sense.

Therefore this is not the way to go, even with git.

Footprints are a very different case: currently they have no derived footprints.

eelik · October 19, 2021, 11:08am

This is of course true only with the current system, and technically nothing is a “must”. They could as well be in different files if one library were one folder. For KiCad it would be just an implementation detail. I think the problem is what happens outside KiCad: it’s easier to keep the library under control when it can’t be handled with the system’s file manager.

That wouldn’t matter if the libraries could be zipped locally. Also, there’s already a stone age file format which isn’t compressed but has several files in one bunch: tar. KiCad could support it transparently, I believe.

Remember to check https://gitlab.com/kicad/code/kicad/-/issues/6941.

marekr · October 19, 2021, 12:39pm

Yea, enumerating a directory does not encounter any of the mentioned issues there. It’s getting file attributes that could be problematic as you can cause round trip “file opens” multiple times. Depending on architecture, it can be pointless compared to just opening the file.
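To make the distinction concrete, here is a small Python sketch that separates "enumerate only" from "enumerate and fetch attributes". It uses `os.scandir`; whether `entry.stat()` costs an extra system call depends on the platform, and this is Python's behaviour, not a description of what KiCad's own code does. The function names and the `.kicad_mod` suffix filter are just for illustration.

```python
import os
import tempfile

def list_names(directory, suffix=".kicad_mod"):
    """Enumerate only: collect names from the directory listing itself."""
    with os.scandir(directory) as entries:
        return sorted(e.name for e in entries if e.name.endswith(suffix))

def list_with_mtime(directory, suffix=".kicad_mod"):
    """Enumerate plus attributes: also ask for each file's modification time."""
    result = {}
    with os.scandir(directory) as entries:
        for e in entries:
            if e.name.endswith(suffix):
                # This is the per-file attribute query that can get expensive.
                result[e.name] = e.stat().st_mtime
    return result

with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.kicad_mod", "b.kicad_mod", "notes.txt"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write("placeholder\n")
    print(list_names(tmp))
    print(sorted(list_with_mtime(tmp)))
```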

twl · October 19, 2021, 12:56pm

How about allowing more than one symbol/footprint in .kicad_mod files (maybe with some name cache in the header to speed up parsing)? This way one can squeeze an entire library into a single .kicad_mod file, stay git friendly, fix Windows FS sluggishness…

Tom

SembazuruCDE · October 19, 2021, 3:40pm

A strict “one symbol in one file” is problematic with derived symbols. (Derived symbols are the replacement for aliases, right? I haven’t played with 5.99 yet to see how derived symbols are handled differently than aliases.) But if the rule was “one symbol and its derivatives in one file” then it would be possible to have individual files, allowing someone to, for example, simply copy a downloaded symbol file from their downloads folder into the library folder and KiCad would find it. (This currently works for footprints.) As it is with library management in 5.1.x with aliases, if you copy an alias or its parent you get the parent and all the aliases copied, as if the combination of the parent and all aliases were considered one “symbol”.

But I can see the value of one file per library to reduce filesystem churn, especially on M$ Winblows systems like my own…

alexisvl · October 20, 2021, 3:37am

Problem with tar is it doesn’t support random access. We’d have to scan the entire tar file at load and cache offsets to the individual components. zip is almost certainly better for this application. It’d be cool if we could transparently support both folders and zip files for libraries.

Doesn’t play great with git but not all libraries live in git. IMO, it’d be a fine way to distribute release libs.
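The "scan once and cache offsets" step for tar can be sketched with Python's standard `tarfile` module. This assumes an uncompressed tar and uses the documented `TarInfo.offset_data` and `TarInfo.size` attributes; the helper names and the member contents are made up for the example.

```python
import io
import tarfile

def build_tar(members):
    """Build an uncompressed tar in memory from {name: text}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, text in members.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def index_tar(tar_bytes):
    """Scan the whole archive once and record (data offset, size) per member."""
    index = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tf:
        for info in tf:
            if info.isfile():
                index[info.name] = (info.offset_data, info.size)
    return index

def read_member(tar_bytes, index, name):
    """Jump straight to a member's data using the cached offset."""
    offset, size = index[name]
    return tar_bytes[offset:offset + size].decode("utf-8")

# Placeholder contents, not real footprint data.
tar_bytes = build_tar({
    "R_0603.kicad_mod": "(footprint R_0603)\n",
    "C_0805.kicad_mod": "(footprint C_0805)\n",
})
index = index_tar(tar_bytes)
print(read_member(tar_bytes, index, "C_0805.kicad_mod"))
```

The index has to be rebuilt whenever the archive changes, which is the extra bookkeeping zip avoids by carrying its own central directory.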

eelik · October 20, 2021, 11:08am

How about using a zipping format without compression? (For example 7z and zip should support this.)
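As a quick illustration of the uncompressed variant: Python's `zipfile` can write members with `ZIP_STORED`, in which case the member's text sits in the archive byte for byte, surrounded by the binary zip headers. The member name and content below are placeholders invented for the example.

```python
import io
import zipfile

def pack_stored(members):
    """Write {name: text} into a zip archive without compressing the members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, text in members.items():
            zf.writestr(name, text)
    return buf.getvalue()

# Placeholder content, not real symbol data.
symbol_text = "(symbol placeholder)\n"
stored_archive = pack_stored({"R.kicad_sym": symbol_text})

# With ZIP_STORED the raw text can be found inside the archive bytes.
print(symbol_text.encode("utf-8") in stored_archive)
```

The archive is still not a plain text file overall, because of the local headers and the central directory at the end.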

qu1ck · October 20, 2021, 11:14am

It still has binary data, so not much point, as you don’t get VCS compatibility and don’t get space savings either (however minuscule they would be).

eelik · October 20, 2021, 11:36am

What do you mean?

Apart from the binary parts in the beginning and end of the file it’s plain text. On the other hand distributed development doesn’t work on this file because all changes affect the small binary parts and always clash. But this would work as a local solution to speed up the library loading, yet allowing a VCS for versioning/history.

I also wonder if there really isn’t any text based format with a TOC, like tar where the metadata would be copied in the beginning of the file as lines of text.

Loading is a real problem. As was said, an SSD doesn’t help much; my loading time is over a minute. And it happens quite often if the libraries are modified.
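To show what a text-based bundle with a TOC could look like, here is a purely hypothetical sketch: TOC lines of the form `name offset length` at the top, a blank line, then the file bodies concatenated. This layout is invented for the illustration; it is not an existing format and not anything KiCad supports.

```python
def pack_text_bundle(members):
    """Serialise {name: text} as TOC lines, a blank line, then the bodies."""
    toc_lines = []
    bodies = []
    offset = 0
    for name, text in members.items():
        data = text.encode("utf-8")
        # Offsets are byte positions relative to the start of the body section.
        toc_lines.append(f"{name} {offset} {len(data)}")
        bodies.append(data)
        offset += len(data)
    header = ("\n".join(toc_lines) + "\n\n").encode("utf-8")
    return header + b"".join(bodies)

def read_toc(bundle):
    """Parse only the header; return ({name: (offset, length)}, body start)."""
    body_start = bundle.index(b"\n\n") + 2
    toc = {}
    for line in bundle[:body_start].decode("utf-8").splitlines():
        if line:
            name, offset, length = line.rsplit(" ", 2)
            toc[name] = (int(offset), int(length))
    return toc, body_start

def read_bundle_member(bundle, name):
    """Fetch one member by slicing at the offset recorded in the TOC."""
    toc, body_start = read_toc(bundle)
    offset, length = toc[name]
    start = body_start + offset
    return bundle[start:start + length].decode("utf-8")

# Placeholder contents, not real symbol data.
bundle = pack_text_bundle({
    "R.kicad_sym": "(symbol R)\n",
    "C.kicad_sym": "(symbol C)\n",
})
print(read_bundle_member(bundle, "C.kicad_sym"))
```

The obvious weakness is the same one raised for zip: editing any member shifts the offsets of everything after it, so the TOC lines change on almost every commit.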

qu1ck · October 20, 2021, 11:39am

Exactly this. Most VCS don’t even attempt to diff binary files and are not suited to store them efficiently.

retiredfeline · October 20, 2021, 12:32pm

The repo storage format and the released library format can/should be separate anyway.