I love DuckDB; my usual workflow is:
- initially read my data from whatever source (CSV, relational database somewhere, whatever)
- write it to one or more parquet files in a directory
- tell duckdb that the directory is my data source
Then DuckDB treats the directory just like a database that you can build indexes on, and since they're Parquet files they're hella small and statically typed. My workflow was already pretty fast and efficient before, but DuckDB has really sped up my data wrangling and analysis a ton.
I use it for exactly the same thing.
I used to spend hours agonizing over documenting things because I couldn't get the tone right, or I'd over-explained, or some other stupid shit.
Now I give my llamafile the code, it gives me a reasonable set of documentation, I edit that documentation because the LLM isn't perfect, and I'm done in 10 minutes.