## Getting started

You need a copy of https://kaikki.org/frwiktionary/raw-wiktextract-data.jsonl.gz

## Initial import speed

Problem: the current import speed is too slow.

Current import speed with encoding/json:

- `(1780000-990000)/(22:37:09-20:46:10)`
- `790000/((22*3600+37*60+9)-(20*3600+46*60+10))`
- 119 inserts per second

What if we:

1) use goccy/go-json for decoding?
   - `40000/((46*60+9)-(40*60+25))` = 116 inserts per second
   - Barely any change, so it looks like the database, not JSON decoding, is our bottleneck.
2) parallelize?
3) other performance optimizations?
   - https://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite
   - wrap all inserts in one transaction: `410000/(29-13)` = 25,625 inserts per second!! Much, much better! (using plain old encoding/json instead of goccy: about 20,000 per second)

Decided on using goccy to unmarshal, and doing everything in one SQLite transaction.