This is exactly how I was thinking we could implement downloading footprints and symbols. What is fundamentally wrong with this approach assuming that each KiCad user downloads 5% of the library?
One big issue is how KiCad gets a directory listing for the 3D files, even before it curls them down. Internally KiCad uses the GitHub API to get a directory listing of the available models. This “mostly” works except that the GitHub API has moved on in some regards. Specifically, large directories with (IIRC) > 100 models will not report some of the models.
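(For reference, fetching that kind of directory listing through the GitHub contents API looks roughly like the sketch below. The repository and path names are placeholders, not necessarily what KiCad queries internally.)

import requests

# Assumed names, for illustration only.
OWNER = "KiCad"
REPO = "kicad-library"
PATH = "modules/packages3d/Pin_Headers.3dshapes"

url = "https://api.github.com/repos/{}/{}/contents/{}".format(OWNER, REPO, PATH)
resp = requests.get(url)
resp.raise_for_status()

# The contents API caps how many entries it returns for a large directory,
# and anything beyond the cap is silently missing, which is the failure
# mode described above.
for entry in resp.json():
    print(entry["name"], entry["size"], entry["download_url"])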
What are the potential workarounds for Git not supporting per-file downloads?
Use Git as God intended it to be used, and clone the entire repo
Use the SVN interface to GitHub, which allows directory traversal and pulling down individual files, and is not limited by the restrictions placed upon API access. I have done some experimenting with a Python SVN library and it seems to have potential. There would be a lot of work required to turn this into a fully-fledged library tool, but I think it is worth further investigation (a rough sketch follows at the end of this post).
Host each model (zipped) on a website that gets rebuilt every 48hr. Users can download a single model (zipped) or a single folder of models (e.g. pin headers) zipped. With good 7z compression, pin header models are only a couple of meg (for the entire set!). What a weirdly specific answer, you may say… Almost as if I have been working on this exact thing…
In a very special version of KiCad that lives only on my laptop, you can also natively array models inside KiCad, to produce (for example) an entire family of pin headers from a single model file…
Once I get the export-to-STEP functionality fully working, I will submit it to the developers' mailing list.
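As a rough sketch of the SVN idea, assuming the pysvn bindings and GitHub's SVN bridge (the repository URL and file names below are illustrative only):

import pysvn

client = pysvn.Client()
# GitHub exposes a Subversion view of each repository; /trunk maps to the
# default branch. The URL and paths here are placeholders.
base = "https://github.com/KiCad/kicad-library/trunk/modules/packages3d"

# List one folder of models without cloning anything.
for entry in client.ls(base + "/Pin_Headers.3dshapes"):
    print(entry.name)

# Pull down a single model on demand.
client.export(base + "/Pin_Headers.3dshapes/Pin_Header_Straight_1x02.wrl",
              "/tmp/Pin_Header_Straight_1x02.wrl")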
My message on another thread specifically addresses this with an out-of-band file of hashes that could be used to determine which files/folders need updating. SVN may indeed be the path to downloading these files or folders. Does SVN maintain server-side hashes for files or folders, and does PythonSVN allow retrieving them?
I can see a decent path to implementation: scripted generation of a server-side manifest listing file and folder hashes, local generation of the user's file and folder hashes, and then either raw download of the GitHub files or SVN download of individual files and/or folders, with threshold-based git cloning when necessary.
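A minimal sketch of that flow, assuming the server publishes a manifest of per-file SHA-256 hashes (the format and names here are my own, not anything that exists today):

import hashlib
import os

def file_hash(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def local_manifest(root):
    # Map of relative path -> hash for everything under the local library.
    hashes = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            hashes[os.path.relpath(full, root)] = file_hash(full)
    return hashes

def folder_hash(manifest, folder):
    # Derive a folder-level hash from the sorted hashes of the files inside it,
    # so a whole folder can be compared in one shot.
    h = hashlib.sha256()
    for path in sorted(p for p in manifest if p.startswith(folder + os.sep)):
        h.update(manifest[path].encode())
    return h.hexdigest()

def stale_files(local, remote):
    # Files that are missing locally or differ from the server's manifest.
    return [p for p, digest in remote.items() if local.get(p) != digest]

Comparing the two manifests gives the list of files to fetch individually; if that list grows past some threshold, fall back to a full git clone instead.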
Yes I saw that post on the other thread and was somewhat perplexed by it. The series of hashes you describe is exactly the functionality that SVN and Git already provide. Once you are linked into the library via SVN, you already have the tools you need to determine which files have changed.
If you (or other users) want a local copy of the 3D files that can tell you which files have changed, then this solution already exists. Clone that using Git and you have achieved your stated goals.
The SVN approach (might) solve a different problem, which is as-needed access of individual model files.
But git does not provide hashes of individual files or folders, only of commits and entire repositories. I don’t see how to get hashes of individual files from git. Do you?
Perhaps SVN does provide file hashes? If so, then I agree that the out-of-band file of hashes isn’t necessary.
I apologize for the time you're spending on this; please let me know if I just need to learn more about git.
As far as I can tell, git-diff-tree is the closest command on that page:
The “git-diff-tree” command begins its output by printing the hash of what is being compared. After that, all the commands print one output line per changed file.
The git-diff-index and git-diff-files commands compare local files with each other or with the local index. Is there a way to specify a remote repository?
The SVN Book describes the svn info command. But its output doesn't include remote-repository checksums or local-directory checksums, only local-file checksums.
Am I missing something?
A checksum appears only in this first excerpt:
svn info will show you all the useful information that it has for items in your working copy. It will show information for files:
$ svn info foo.c
Path: foo.c
Name: foo.c
Working Copy Root Path: /home/sally/projects/test
URL: http://svn.red-bean.com/repos/test/foo.c
Repository Root: http://svn.red-bean.com/repos/test
Repository UUID: 5e7d134a-54fb-0310-bd04-b611643e5c25
Revision: 4417
Node Kind: file
Schedule: normal
Last Changed Author: sally
Last Changed Rev: 20
Last Changed Date: 2003-01-13 16:43:13 -0600 (Mon, 13 Jan 2003)
Text Last Updated: 2003-01-16 21:18:16 -0600 (Thu, 16 Jan 2003)
Properties Last Updated: 2003-01-13 21:50:19 -0600 (Mon, 13 Jan 2003)
Checksum: d6aeb60b0662ccceb6bce4bac344cb66
No checksum here for local directories:
It will also show information for directories:
$ svn info vendors
Path: vendors
Working Copy Root Path: /home/sally/projects/test
URL: http://svn.red-bean.com/repos/test/vendors
Repository Root: http://svn.red-bean.com/repos/test
Repository UUID: 5e7d134a-54fb-0310-bd04-b611643e5c25
Revision: 19
Node Kind: directory
Schedule: normal
Last Changed Author: harry
Last Changed Rev: 19
Last Changed Date: 2003-01-16 23:21:19 -0600 (Thu, 16 Jan 2003)
Properties Last Updated: 2003-01-16 23:39:02 -0600 (Thu, 16 Jan 2003)
No checksum for remote files:
svn info also acts on URLs (also note that the file readme.doc in this example is locked, so lock information is also provided):
$ svn info http://svn.red-bean.com/repos/test/readme.doc
Path: readme.doc
Name: readme.doc
URL: http://svn.red-bean.com/repos/test/readme.doc
Repository Root: http://svn.red-bean.com/repos/test
Repository UUID: 5e7d134a-54fb-0310-bd04-b611643e5c25
Revision: 1
Node Kind: file
Schedule: normal
Last Changed Author: sally
Last Changed Rev: 42
Last Changed Date: 2003-01-14 23:21:19 -0600 (Tue, 14 Jan 2003)
Lock Token: opaquelocktoken:14011d4b-54fb-0310-8541-dbd16bd471b2
Lock Owner: harry
Lock Created: 2003-01-15 17:35:12 -0600 (Wed, 15 Jan 2003)
Lock Comment (1 line):
My test lock comment
Try not to get too focused on file “checksum” or “hash”. Both Git and SVN have the ability to compare a local repository with a remote repository and tell you exactly which files have changed.
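For example (a sketch only; the remote, branch and folder names are placeholders), once a local clone or working copy exists:

$ git fetch origin                                # refresh the remote-tracking refs
$ git diff --name-status HEAD origin/master       # list files that differ from the remote branch

$ svn status --show-updates                       # '*' marks items that have changed on the server
$ svn update Pin_Headers.3dshapes                 # pull down only the changed files in that folder

The fetch step also answers the earlier git-diff question: the remote-tracking ref (origin/master here) stands in for the remote side of the comparison.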
What is the advantage of building a connector on the fly, generating the exported STEP model from an array of single parts?
When you build the STEP connector for export, it will be the same size as the connector you would have downloaded from the online library…
So a solution that downloads only what is required to export/visualize would be simpler than one that builds the model on the fly…
Moreover, what checks on the mechanical geometry of the generated part could you perform automatically? This is a must in a mechanical environment.
For example, the THT resistors in the KiCad libraries have geometry issues, and they will produce an exported board with geometry issues.
But very few have noticed that. Geometry checking is a must in MCAD.
Only building 3D models with an MCAD environment in mind will give the right result.
Here are the faulty STEP THT resistor models; they display fine in some CAD packages, but they have geometry issues.
The geometry check is a delicate process that can be done efficiently only once the library has been built.
And in which cases could you adopt the array-building process, apart from the simple pin headers?
Can you imagine doing it simply for box pin headers?
I would focus on the GitHub issue, trying to let users download or update only what they use, and keep the library a solid MCAD source of checked mechanical models, as manufacturers do.
I believe that the next version of KiCad (v5) will be released with footprints bundled, and the default will be to use local footprints. The github plugin/repos will still be available for people who want to use it. That will give plenty of time until KiCad v6 to develop something better. In the meantime, prototypes can be developed with external scripts.
There seem to be several use cases. Data requirements:
Symbols + footprints, but no 3D models
Symbols + footprints, only WRL models
Symbols + footprints, WRL and STEP models
Network use cases:
No network access (fully offline)
Limited bandwidth access
Unlimited bandwidth access
Network access via proxy (behind corporate firewalls)
Update methods:
Use versioned snapshot of all data
Periodic download of all latest data
On demand retrieval of latest data for subset of components
Currently KiCad works well for a couple of those use cases, which probably covers 99% of users. It might be difficult to cover all use cases with one script; I think a set of scripts covering different use cases would be more manageable.
Another option (one of many) would be to use rsync. It can synchronize individual files, or a whole recursive set of directories, and can be told “grab changed versions of files that I have, but don’t download any new files from those directories”. Changed-file detection can be done on the basis of modification date and/or hash-of-contents.
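For example, assuming an rsync-capable mirror of the 3D models existed (the URL below is invented), the behaviours described above map onto standard rsync options:

$ rsync -avz rsync://mirror.example.org/packages3d/ ~/kicad/3dmodels/               # full mirror
$ rsync -avz --existing rsync://mirror.example.org/packages3d/ ~/kicad/3dmodels/    # update only files already present locally; skip new ones
$ rsync -avzc --existing rsync://mirror.example.org/packages3d/ ~/kicad/3dmodels/   # as above, but detect changes by checksum rather than size and mtime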
I can sort of see where Wayne is coming from: we could end up with a partial fix which seems expedient but is not extendable and really skirts the fundamental issue, which is where the GitHub plugin has ended up. Kind of like adding lifeboats when we really need to be steering around the iceberg.
Handling and deploying large amounts of data is not a simple fix; a lot of projects with any sort of online interaction end up building bespoke solutions, because there is not really any off-the-shelf solution. Any design really needs a tailored server-side setup: standard servers such as HTTP, Git or FTP provide generic solutions, not exactly what we want.
Downloading individual files on demand would be a worse solution; it is much less efficient and unusable by people who work offline.
Really, the problem needs to be split between the representation of the data (including compression), and how the data is deployed (offline, batch update, on demand update). Trying to do both at once is always going to be much harder. Many projects have a similar problem, and implement some form of compressed packages which can be installed with the program, or downloaded/updates as needed. A manifest within the package tells the program what is in the package.
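As a sketch of what such a package manifest might look like (purely illustrative; the field names and format are my own invention, not an existing KiCad format):

{
  "package": "Connectors_PinHeaders",
  "version": "1.0.0",
  "contents": {
    "symbols": ["conn.lib"],
    "footprints": ["Pin_Headers.pretty/"],
    "models_wrl": ["Pin_Headers.3dshapes/*.wrl"],
    "models_step": ["Pin_Headers.3dshapes/*.step"]
  },
  "sha256": "<hash of the package archive>"
}

The program only needs to read the manifest to know what the package provides and whether the installed copy is current, regardless of how the archive itself was delivered (offline bundle, batch update or on-demand download).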
I believe part of the solution is a dedicated library manager that would allow the selection of packages (schematic symbols, footprints and 3D parts) from varying sources. The problem is big enough and complex enough to warrant a better solution than just trying to shoehorn management into either eeschema or pcbnew. This would allow dealing with compressed files, multiple repos, editing, etc.
Like an open-source version of Altium's Vault - but much, much better.
I agree that this is the superior solution. The array feature was something suggested on the forums and I thought it would be a good feature to have - independent of addressing the library size issues.
Perhaps suggest that he pays himself the cost that ends up being billed to the end user?
If so, he owes me $15 USD already!
I can, and probably most others can, work around the issue. However, when I posted this there was zero reason to think that this download was going to be over 1 GB.
I don't blame the developers for what happened to me, though I do wish they would consider providing a warning when the download size is substantially larger than normal.