Part management and spreadsheet vs true databases

bobc · May 25, 2019, 2:51pm

As it seems highly relevant, there is some discussion on the dev list regarding “atomic libraries” (for a particular definition of atomic). https://lists.launchpad.net/kicad-developers/msg40761.html begins

After some discussion, we are trying to decide whether implementing a basic atomic library support would be useful during v6

I don’t know who “we” refers to, or where the discussion was. Does anyone know?

I’m not clear how the proposal is different to what is possible now, I am slightly worried it just creates a new standard which is rather limited, and then there will be a third method to allow slightly more things, then a fourth method…

The definition of “atomic” seems to be continually watered down, to me it means all the data for a part in one place, not links to other files.

Rene_Poschl · May 25, 2019, 2:59pm

This might refer to the suggestion by @Seth_h

https://lists.launchpad.net/kicad-developers/msg40761.html

And as i somehow screwed up my first answer a second thread (i clicked answer instead of answer to mailing list in my mail program)

https://lists.launchpad.net/kicad-developers/msg40762.html

bobc · May 25, 2019, 3:02pm

No, that is the post I was quoting from! It states there was some discussion prior to that thread.

davidsrsb · May 25, 2019, 9:43pm

I agree the KiCad text file method does not scale. Obviously each company has a unique house number series and structure and sequence of tables.
This means that a database connector has to be totally customizable to access or select parameters that matter to KiCad
Do the commercial eCAD packages actually do a good job of this in practice?

nhatkhai · May 28, 2019, 4:45am

By the way, I believe google search engine is operate effectively on text base files for the whole internet. So I don’t think it is truth to not scale-able with text file. It just how we implementing to handling them.
[ADDITION] Windows at temp to improve their file search by implementing “indexing” cached similar like google (but not as good). So again, these are a proof of text file/binary file aren’t issue of scale-able. It is how we implementing it/handling it.

nhatkhai · May 28, 2019, 4:54am

I the good design - detail implement of database/files do not need to be know by third party software. It all above a flexible interface. So the underline detail implementation can be change/switch every second without broken a thing. And designer got flexibility to scale it up/down easily.

jwr · May 28, 2019, 11:19pm

Well, I am obviously biased, given that I am the author of PartsBox.io — but I will add my 2 cents to this discussion. Neither a spreadsheet nor a database are the right tools for managing parts, if what you mean by that is not just symbols+footprints+models, but also physical parts later on. I know, because I tried keeping track of my parts first in a spreadsheet, and then in a “simple database”.

As it turned out, even a “simple” parts inventory where you keep parts in storage locations quickly outgrows a spreadsheet, usually the moment you decide that some parts might be in several different locations.

Maintaining a priced BOM in a spreadsheet is also a losing battle: even if you manage to deal with multiple currencies and enter all the price breaks from multiple distributors, all of that information becomes outdated basically the moment the spreadsheet is saved.

All that said, I would like to achieve better interoperability with KiCad, especially given that I really enjoy using KiCad. I am planning to take a closer look at scripting in KiCad as a way to interface it to a PartsBox parts database, as well as to facilitate exporting a BOM for pricing and managing production later on.

Robin_Gangopadhya · May 29, 2019, 6:53pm

This latest information will interest Hildogjr as well.
kicost runs as-is in a Linux machine but it simply wont run in W10( at least mine).
BOM that comes out in Excel format is really very useful.
A lot can be easily added in the symbol library properties fields.
We will refine it as we move along. As to searching for parts in named distributor(s) comes really for us at the time of selecting a part: to check if it is shippable, non-stock, long lead item , high cost etc…we do intense search during design but from manufacturers’ sites.
What I feel would be very effective and productive in any PCB/sch tool- KiCAD as well- is to be able to right click on a part that is just selected to be put in the schematic: then its full part # from any previous project, comes up to be copied in. This process can include one footprint out of many you might have used before. For example, if you put R in a place, entering a value, right click will pop up 0402 if used most of the time. Then 0805 is used less often…so on & it places the full part number in a “PART NUMBER” field or “manf#” field.
Bear in mind, PART NUMBERS contain complete footprint information, implied vendor etc…so it is a simple matter to have that filled in as well.
That will boost productivity a lot- make doing this work less boring …
Now that I am at boring level, it would be nice if texts came out placed correctly as well…now, that will make it EXCITING…done at the “board set up” level.
r

bobc · May 29, 2019, 9:00pm

Do you really think Google index millions of text pages then keep the results in a CSV file?

Bigtable

A Distributed Storage System for Structured Data

Bigtable is a distributed storage system (built by Google) for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.

Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving).

Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products.

Some features

fast and extremely large-scale DBMS
a sparse, distributed multi-dimensional sorted map, sharing characteristics of both row-oriented and column-oriented databases.
designed to scale into the petabyte range
it works across hundreds or thousands of machines
it is easy to add more machines to the system and automatically start taking advantage of those resources without any reconfiguration
each table has multiple dimensions (one of which is a field for time, allowing versioning)
tables are optimized for GFS (Google File System) by being split into multiple tablets - segments of the table as split along a row chosen such that the tablet will be ~200 megabytes in size.

Architecture

BigTable is not a relational database. It does not support joins nor does it support rich SQL-like queries. Each table is a multidimensional sparse map. Tables consist of rows and columns, and each cell has a time stamp. There can be multiple versions of a cell with different time stamps. The time stamp allows for operations such as “select ‘n’ versions of this Web page” or “delete cells that are older than a specific date/time.”

In order to manage the huge tables, Bigtable splits tables at row boundaries and saves them as tablets. A tablet is around 200 MB, and each machine saves about 100 tablets. This setup allows tablets from a single table to be spread among many servers. It also allows for fine-grained load balancing. If one table is receiving many queries, it can shed other tablets or move the busy table to another machine that is not so busy. Also, if a machine goes down, a tablet may be spread across many other servers so that the performance impact on any given machine is minimal.

Tables are stored as immutable SSTables and a tail of logs (one log per machine). When a machine runs out of system memory, it compresses some tablets using Google proprietary compression techniques (BMDiff and Zippy). Minor compactions involve only a few tablets, while major compactions involve the whole table system and recover hard-disk space.

The locations of Bigtable tablets are stored in cells. The lookup of any particular tablet is handled by a three-tiered system. The clients get a point to a META0 table, of which there is only one. The META0 table keeps track of many META1 tablets that contain the locations of the tablets being looked up. Both META0 and META1 make heavy use of pre-fetching and caching to minimize bottlenecks in the system.

Implementation

BigTable is built on Google File System (GFS), which is used as a backing store for log and data files. GFS provides reliable storage for SSTables, a Google-proprietary file format used to persist table data.

Another service that BigTable makes heavy use of is Chubby , a highly-available, reliable distributed lock service. Chubby allows clients to take a lock, possibly associating it with some metadata, which it can renew by sending keep alive messages back to Chubby. The locks are stored in a filesystem-like hierarchical naming structure.

There are three primary server types of interest in the Bigtable system:

Master servers: assign tablets to tablet servers, keeps track of where tablets are located and redistributes tasks as needed.
Tablet servers: handle read/write requests for tablets and split tablets when they exceed size limits (usually 100MB - 200MB). If a tablet server fails, then a 100 tablet servers each pickup 1 new tablet and the system recovers.
Lock servers: instances of the Chubby distributed lock service. Lots of actions within BigTable require acquisition of locks including opening tablets for writing, ensuring that there is no more than one active Master at a time, and access control checking.

bobc · May 29, 2019, 9:04pm

So a database is the wrong tool, but your program uses a database. Not sure your addition here is more than an excuse for an advert. And we are still not talking about managing inventory.

jwr · May 29, 2019, 9:58pm

@bobc: not sure why the hostility. There are other tools out there if you don’t like mine, and some are free and open-source (like PartKeepr for instance).

My point was that there is much more to managing parts inventory than just storing information (which is what a database does). Every program uses a database underneath. You might want to write your own program to manage your parts collection, which will use a database. I was trying to point out that the database itself isn’t enough. I know that, because I tried, both with a spreadsheet and then using PosgreSQL.

But I think this discussion is about two concepts: keeping data about parts you might use in a design and keeping data about actual inventory (physical parts that you have). It so happens that my application addresses both (although it’s much more extensive on the inventory side), but I’d agree that keeping data about parts you might use is simpler than managing actual inventory.

nhatkhai · May 30, 2019, 12:04am

It is the detail of implementation of caching indexing… so who care what it in what form!

nhatkhai · May 30, 2019, 12:18am

The most handy form for human are text. So if we stick with text - tooling is a secondary, not a must have. If we go with form that can not consume raw by human - we got to have a lot of tools to go about it. This doesn’t mean we should not use binary files. The binary form usually is the way make thing work faster and efficient, but I see we don’t need to use it as a primary source of storage. It is a detail implementation that should be transparent from the user.

joost · June 1, 2019, 3:29pm

And i am thinking the storage type (text etc) is a detail we should not care about, database specs we should not care about either. We should care about workflow and that alone and let designers pick the right technologies.

I see that few (any?) on this thread actually code but the discussion is mostly about what tools to pick and tech to use to solve our problem. We should lay out our use-cases and workflows instead; that will make the designer’s life easier and the outcome more interesting/relevant

nhatkhai · June 3, 2019, 2:03am

Good point :-).
I was thinking the whole time that mine use case was api take json with filed, and filed values (regex), return a enumerator of matched parts… some thing like that.

However, I do not know the intend use cases of this topic is about.

nhatkhai · June 10, 2019, 8:14pm

Good stuff - relative to this

SchrodingersGat · June 18, 2019, 12:00pm

Hi guys, I saw this discussion pop up and thought you might be interested in a side project I have been working on. It’s an (open source) inventory management system which I have been building out of frustration with the existing solutions (both free and commercial).

The project started out of a need to organize electronics parts and builds (but it is not specific to electronic parts by any means). I am focusing on inventory and stock management, and BOM / builds.

Ideally I’d like to come up with a way of integrating directly with KiCad (this was the original design goal for this project but it has ballooned a bit). There’s a REST API and a Python library, so the KiCad integration just needs some careful consideration. I haven’t come up with a good solution for this part yet.

A couple of brief screenshots:

This project is still under development, but is being used in the “real world”. If anyone feels like making use of it, or even helping me to develop it, that’d be great! This is my first foray into the world of web development so please forgive if it’s a bit clunky.

Also the documentation is currently pretty sparse!

I hope someone finds it useful

qu1ck · June 18, 2019, 10:52pm

1500 commits in 3 months

Amazing effort, I’ll definitely check this out. If it solves some issues I have with my current inventory system (mainly around storage organization) I’ll happily migrate.

Goldbug · June 24, 2019, 8:31am

This looks incredibly nice! I will definitely try this out. With KiCad integration it would be exactly what I have been looking for.

system · September 22, 2019, 8:38am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.