GitHub - tinybirdco/llm-benchmark: We assessed the ability of popular LLMs to generate accurate and efficient SQL from natural language prompts. Using a 200 million record dataset from the GH Archive…