GitHub Trending

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows -- all through natural language commands. Use it in your terminal, IDE, or tag @claude on GitHub. Learn more in the official documentation.

Get started
Install Claude Code:
- macOS/Linux: curl -fsSL https://claude.ai/install.sh | bash
- Homebrew (macOS): brew install --cask claude-code
- Windows: irm https://claude.ai/install.ps1 | iex
- NPM: npm install -g @anthropic-ai/claude-code (note: installing with NPM also requires Node.js 18+)
Then navigate to your project directory and run claude.

Plugins
This repository includes several Claude Code plugins that extend functionality with custom commands and agents. See the plugins directory for detailed documentation on available plugins.

Reporting bugs
We welcome your feedback. Use the /bug command to report issues directly within Claude Code, or file a GitHub issue.

Connect on Discord
Join the Claude Developers Discord to connect with other developers using Claude Code. Get help, share feedback, and discuss your projects with the community.

Data collection, usage, and retention
When you use Claude Code, we collect feedback, which includes usage data (such as code acceptance or rejection), associated conversation data, and user feedback submitted via the /bug command. See our data usage policies for how we use your data.

Privacy safeguards
We have implemented several safeguards to protect your data, including limited retention periods for sensitive information, restricted access to user session data, and clear policies against using feedback for model training. For full details, please review our Commercial Terms of Service and Privacy Policy.
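A minimal macOS/Linux quick-start consolidating the steps above; the project path is a placeholder:

curl -fsSL https://claude.ai/install.sh | bash   # install Claude Code
cd ~/projects/my-app                             # placeholder: your project directory
claude                                           # start an interactive session in the repo
# Inside the session, describe tasks in natural language; use /bug to report issues.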

The open source AI coding agent.

Installation
# YOLO
curl -fsSL https://opencode.ai/install | bash

# Package managers
npm i -g opencode-ai@latest                              # or bun/pnpm/yarn
scoop bucket add extras; scoop install extras/opencode   # Windows
choco install opencode                                   # Windows
brew install anomalyco/tap/opencode                      # macOS and Linux (recommended, always up to date)
brew install opencode                                    # macOS and Linux (official brew formula, updated less)
paru -S opencode-bin                                     # Arch Linux
mise use -g opencode                                     # Any OS
nix run nixpkgs#opencode                                 # or github:anomalyco/opencode for latest dev branch

Tip: remove versions older than 0.1.x before installing.

Desktop App (BETA)
OpenCode is also available as a desktop application. Download directly from the releases page or opencode.ai/download:
- macOS (Apple Silicon): opencode-desktop-darwin-aarch64.dmg
- macOS (Intel): opencode-desktop-darwin-x64.dmg
- Windows: opencode-desktop-windows-x64.exe
- Linux: .deb, .rpm, or AppImage
On macOS you can also install via Homebrew: brew install --cask opencode-desktop

Installation Directory
The install script resolves the installation path in the following priority order:
1. $OPENCODE_INSTALL_DIR - custom installation directory
2. $XDG_BIN_DIR - XDG Base Directory Specification compliant path
3. $HOME/bin - standard user binary directory (if it exists or can be created)
4. $HOME/.opencode/bin - default fallback

# Examples
OPENCODE_INSTALL_DIR=/usr/local/bin curl -fsSL https://opencode.ai/install | bash
XDG_BIN_DIR=$HOME/.local/bin curl -fsSL https://opencode.ai/install | bash

Agents
OpenCode includes two built-in agents you can switch between with the Tab key:
- build - the default, full-access agent for development work
- plan - a read-only agent for analysis and code exploration: it denies file edits by default, asks permission before running bash commands, and is ideal for exploring unfamiliar codebases or planning changes
Also included is a general subagent for complex searches and multi-step tasks. It is used internally and can be invoked with @general in messages. Learn more about agents.

Documentation
For more on how to configure OpenCode, head over to our docs.

Contributing
If you're interested in contributing to OpenCode, please read our contributing docs before submitting a pull request.

Building on OpenCode
If you are working on a project related to OpenCode that uses "opencode" as part of its name (for example, "opencode-dashboard" or "opencode-mobile"), please add a note to your README clarifying that it is not built by the OpenCode team and is not affiliated with us in any way.

FAQ
How is this different from Claude Code? It's very similar to Claude Code in terms of capability. Here are the key differences:
- 100% open source.
- Not coupled to any provider. Although we recommend the models we provide through OpenCode Zen, OpenCode can be used with Claude, OpenAI, Google, or even local models. As models evolve, the gaps between them will close and pricing will drop, so being provider-agnostic is important.
- Out-of-the-box LSP support.
- A focus on the TUI. OpenCode is built by neovim users and the creators of terminal.shop; we are going to push the limits of what's possible in the terminal.
- A client/server architecture. This allows OpenCode to run on your computer while you drive it remotely from a mobile app; the TUI frontend is just one of the possible clients.

Join our community: Discord | X.com
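As a worked example of the install-path priority above, a hedged sketch that installs into ~/.local/bin via XDG_BIN_DIR; the PATH export is an assumption about your shell setup, not something the installer does:

XDG_BIN_DIR=$HOME/.local/bin curl -fsSL https://opencode.ai/install | bash
export PATH="$HOME/.local/bin:$PATH"   # add to your shell profile if this directory is not already on PATH
opencode                               # then launch the TUI from your project directory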

Connect your devices into a secure WireGuard®-based overlay network with SSO, MFA, and granular access controls. Start using NetBird at netbird.io. See the documentation, or join our Slack channel or community forum. New: NetBird Terraform provider.

NetBird combines a configuration-free peer-to-peer private network and a centralized access control system in a single platform, making it easy to create secure private networks for your organization or home.

Connect. NetBird creates a WireGuard-based overlay network that automatically connects your machines over an encrypted tunnel, leaving behind the hassle of opening ports, complex firewall rules, VPN gateways, and so forth.

Secure. NetBird enables secure remote access by applying granular access policies while allowing you to manage them intuitively from a single place. Works universally on any infrastructure.

Open Source Network Security in a Single Platform (demo video: https://github.com/user-attachments/assets/10cec749-bb56-4ab3-97af-4e38850108d2). See also: NetBird on Lawrence Systems (video).

Key features
- Connectivity: kernel WireGuard, peer-to-peer connections, connection relay fallback, routes to external networks, NAT traversal with BPF, quantum resistance with Rosenpass
- Management: admin Web UI, auto peer discovery and configuration, IdP integrations, private DNS, multiuser support
- Security: SSO & MFA support, access control (groups & rules), activity logging, device posture checks, peer-to-peer encryption, periodic re-authentication
- Automation: public API, setup keys for bulk network provisioning, self-hosting quickstart script, IdP groups sync with JWT
- Platforms: Linux, Mac, Windows, Android, iOS, OpenWRT, Serverless, Docker

Quickstart with NetBird Cloud
1. Download and install NetBird at https://app.netbird.io/install
2. Follow the steps to sign up with Google, Microsoft, GitHub, or your email address.
3. Check the NetBird admin UI.
4. Add more machines.

Quickstart with self-hosted NetBird
This is the quickest way to try self-hosted NetBird. It should take around 5 minutes to get started if you already have a public domain and a VM. Follow the advanced guide with a custom identity provider for installations with different IdPs.

Infrastructure requirements:
- A Linux VM with at least 1 CPU and 2 GB of memory. The VM should be publicly accessible on TCP ports 80 and 443 and UDP port 3478.
- A public domain name pointing to the VM.

Software requirements:
- Docker installed on the VM with the docker-compose plugin (see the Docker installation guide), or docker with docker-compose version 2 or higher.
- jq installed. Usually available in the official repositories; install with sudo apt install jq or sudo yum install jq.
- curl installed. Usually available in the official repositories; install with sudo apt install curl or sudo yum install curl.

Steps
Download and run the installation script:
export NETBIRD_DOMAIN=netbird.example.com; curl -fsSL https://github.com/netbirdio/netbird/releases/latest/download/getting-started.sh | bash
Once finished, you can manage the resources via docker-compose (see the sketch at the end of this entry).

A bit on NetBird internals
Every machine in the network runs the NetBird Agent (or Client), which manages WireGuard. Every agent connects to the Management Service, which holds the network state, manages peer IPs, and distributes network updates to agents (peers).
The NetBird agent uses WebRTC ICE, implemented in the pion/ice library, to discover connection candidates when establishing a peer-to-peer connection between machines. Connection candidates are discovered with the help of STUN servers. Agents negotiate a connection through the Signal Service, passing p2p-encrypted messages with candidates. Sometimes NAT traversal is unsuccessful due to strict NATs (e.g., mobile carrier-grade NAT) and a p2p connection isn't possible. When this occurs, the system falls back to a relay server (TURN), and a secure WireGuard tunnel is established via the TURN server. Coturn has been used successfully for both STUN and TURN in NetBird setups. See the complete architecture overview for details.

Community projects: NetBird installer script; NetBird Ansible collection by Dominion Solutions.

Note: the main branch may be in an unstable or even broken state during development. For stable versions, see releases.

Support acknowledgement
In November 2022, NetBird joined the StartUpSecure program sponsored by the Federal Ministry of Education and Research of the Federal Republic of Germany. Together with the CISPA Helmholtz Center for Information Security, NetBird brings security best practices and simplicity to private networking.

We use open-source technologies like WireGuard®, Pion ICE (WebRTC), Coturn, and Rosenpass. We very much appreciate the work these projects are doing, and we'd greatly appreciate it if you could support them in any way (e.g., by giving a star or a contribution).

Legal
This repository is licensed under the BSD-3-Clause license, which applies to all parts of the repository except for the directories management/, signal/, and relay/. Those directories are licensed under the GNU Affero General Public License version 3.0 (AGPLv3). See the respective LICENSE files inside each directory. WireGuard and the WireGuard logo are registered trademarks of Jason A. Donenfeld.
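For the self-hosted quickstart above, a hedged sketch of routine management with Docker Compose; it assumes you run the commands from the directory containing the docker-compose.yml generated by getting-started.sh:

docker compose ps                              # verify all NetBird services are up
docker compose logs -f                         # follow logs while troubleshooting
docker compose pull && docker compose up -d    # pull newer images and restart the stack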

MiroThinker is an open-source search agent model built for tool-augmented reasoning and real-world information seeking, aiming to match the deep research experience of OpenAI Deep Research and Gemini Deep Research. Try our demo!

MiroThinker is MiroMind's flagship research agent model: an open-source search model designed to advance tool-augmented reasoning and information-seeking capabilities, enabling complex real-world research workflows across diverse challenges. The project currently comprises four key components:
- MiroThinker: an open-source search model that natively supports tool-assisted reasoning, achieving leading performance across multiple benchmarks (e.g., HLE, HLE-Text-2158, HLE-Text-500, BrowseComp, BrowseComp-ZH, GAIA, XBench-DeepSearch, FutureX, and Frames). See Quick Start.
- MiroFlow: an open-source research agent framework that offers reproducible state-of-the-art performance across multiple benchmarks. See MiroFlow for details.
- MiroVerse: a premium open-source training dataset with 147k samples supporting research agent training. See MiroVerse on HuggingFace.
- MiroTrain / MiroRL: training infrastructure that supports stable and efficient training for research agent models. See MiroTrain and MiroRL for details.

Table of Contents: News & Updates · Introduction · Key Features · Performance on Benchmarks · Quick Start · Benchmark Evaluation · Trace Collection · FAQ & Troubleshooting · License · Acknowledgments

News & Updates
- [2026-01-05] We release MiroThinker-v1.5, a world-leading open-source search agent. MiroThinker-v1.5-30B surpasses Kimi-K2-Thinking on BrowseComp-ZH at much lower cost, using only 1/30 of the parameters. MiroThinker-v1.5-235B scores 39.2% on HLE-Text, 69.8% on BrowseComp, 71.5% on BrowseComp-ZH, and 80.8% on GAIA-Val-165, setting a new state of the art among search agents.
- [2025-11-13] MiroThinker-v1.0 is now released! Introducing interactive scaling as a third dimension of performance improvement, MiroThinker v1.0 supports a 256K context window and up to 600 tool calls per task. Available in 8B, 30B, and 72B parameter scales, achieving 37.7%, 47.1%, 55.6%, and 81.9% on HLE-Text, BrowseComp, BrowseComp-ZH, and GAIA-Text-103, respectively. See the Technical Report for more details.
- [2025-09-11] MiroThinker-72B-Preview ranked 4th in this week's FutureX benchmark. See FutureX.

Older updates:
- [2025-09-08] MiroThinker-v0.2 is now released, achieving open-source SOTA performance across multiple benchmarks, including HLE (17.8%), HLE-Text-Only (19.1%), BrowseComp-EN (17.2%), BrowseComp-ZH (29.4%), XBench-DeepSearch (56.0%), and Frames (74.8%).
- [2025-09-07] We added support for more benchmarks, including BrowseComp-ZH, XBench-DeepSearch, and FutureX, and plan to add more in the future.
- [2025-08-22] Streamlined deployment options for MiroThinker models with optimized resource usage and faster startup times. Try the interactive Gradio demo.
- [2025-08-08] MiroThinker-v0.1 released. Models, framework, and data are now fully open-sourced!

Introduction

MiroThinker-v1.5
MiroThinker v1.5 is a world-leading open-source search agent that advances tool-augmented reasoning through interactive scaling: training the model to handle deeper and more frequent agent-environment interactions as a third dimension of performance improvement, beyond model size and context length.
Key features of v1.5:
- Supports a 256K context window, long-horizon reasoning, and deep multi-step analysis.
- Handles up to 400 tool calls per task, a substantial improvement over previous open-source research agents.
- Released in 30B and 235B parameter scales, accompanied by a comprehensive suite of tools and workflows to flexibly support diverse research settings and compute budgets.

| Model Name | Base Model | Max Context | Max Tool Calls | HF Link |
|---|---|---|---|---|
| MiroThinker-v1.5-30B | Qwen3-30B-A3B-Thinking-2507 | 256K | 400 | link |
| MiroThinker-v1.5-235B | Qwen3-235B-A22B-Thinking-2507 | 256K | 400 | link |

MiroThinker v1.5 demonstrates strong general-research performance across a broad range of benchmarks, achieving 39.2%, 69.8%, 71.5%, and 80.8% on HLE-Text, BrowseComp, BrowseComp-ZH, and GAIA-Val-165, respectively. These results surpass previous open-source agents and set new world-leading BrowseComp performance.

MiroThinker-v1.0
Unlike previous agents that scale only model size or context length, MiroThinker v1.0 introduces interactive scaling at the model level, systematically training the model to handle deeper and more frequent agent-environment interactions as a third dimension of performance improvement. Interactive scaling leverages environment feedback and external information acquisition to correct errors and refine trajectories.

Key features of v1.0:
- 256K context window: supports long-horizon reasoning and deep multi-step analysis.
- 600 tool calls: handles up to 600 tool calls per task, a substantial improvement over previous open-source research agents.
- Multiple scales: released in 8B, 30B, and 72B parameter scales, accompanied by a comprehensive suite of tools and workflows to flexibly support diverse research settings and compute budgets.

| Model Name | Base Model | Max Context | Max Tool Calls | HF Link |
|---|---|---|---|---|
| MiroThinker-v1.0-8B | Qwen3-8B | 256K | 600 | link |
| MiroThinker-v1.0-30B | Qwen3-30B-A3B-Thinking-2507 | 256K | 600 | link |
| MiroThinker-v1.0-72B | Qwen2.5-72B-Instruct | 256K | 600 | link |

MiroThinker v1.0 demonstrates strong general-research performance across a broad range of benchmarks, achieving 37.7%, 47.1%, 55.6%, and 81.9% on HLE-Text, BrowseComp, BrowseComp-ZH, and GAIA-Text-103, respectively. These results surpass previous open-source agents and narrow the gap with commercial counterparts such as GPT-5-high.

MiroThinker-v0.2
In this version, we introduced three key improvements:
- Richer training data from both English and Chinese sources, yielding significant gains in benchmark performance and generalization
- Unified DPO training with a single preference dataset across all models
- Extended context length from 40K to 64K for more challenging multi-turn tool-use tasks

Compared to v0.1, MiroThinker v0.2 delivers consistent gains across benchmarks. For example, scores improved from 57.3 to 64.1 on GAIA-Text-103 and from 17.0 to 29.4 on BrowseComp-ZH, reflecting substantial advancements in the model's general research agent capabilities.
| Model Name | Base Model | Max Context | HF Link |
|---|---|---|---|
| MiroThinker-4B-SFT-v0.2 | Qwen3-4B | 64K | link |
| MiroThinker-4B-DPO-v0.2 | Qwen3-4B | 64K | link |
| MiroThinker-8B-SFT-v0.2 | Qwen3-8B | 64K | link |
| MiroThinker-8B-DPO-v0.2 | Qwen3-8B | 64K | link |
| MiroThinker-14B-SFT-v0.2 | Qwen3-14B | 64K | link |
| MiroThinker-14B-DPO-v0.2 | Qwen3-14B | 64K | link |
| MiroThinker-32B-SFT-v0.2 | Qwen3-32B | 64K | link |
| MiroThinker-32B-DPO-v0.2 | Qwen3-32B | 64K | link |

MiroThinker-v0.1
(Figure: performance of open-source models on the GAIA-Validation benchmark.)
We have released the MiroThinker v0.1 series, including both SFT and DPO variants at parameter scales of 8B, 14B, and 32B. Notably, MiroThinker v0.1 achieves state-of-the-art performance among open-source models on the GAIA benchmark, a rigorous evaluation suite for advanced agentic capabilities, demonstrating its strength in long-context, decision-intensive, and real-world task scenarios.

| Model Name | Base Model | Max Context | HF Link |
|---|---|---|---|
| MiroThinker-8B-SFT-v0.1 | Qwen3-8B | 40K | link |
| MiroThinker-8B-DPO-v0.1 | Qwen3-8B | 40K | link |
| MiroThinker-14B-SFT-v0.1 | Qwen3-14B | 40K | link |
| MiroThinker-14B-DPO-v0.1 | Qwen3-14B | 40K | link |
| MiroThinker-32B-SFT-v0.1 | Qwen3-32B | 40K | link |
| MiroThinker-32B-DPO-v0.1 | Qwen3-32B | 40K | link |

Key Features

MiroThinker-optimized framework:
- Fully open-source agent framework: complete transparency with an open framework and open models
- Tool integration: seamless integration with external tools and APIs
- Trace collection: comprehensive logging and analysis of agent interactions, with elapsed time and estimated completion time displayed in minutes; ready for SFT and DPO
- Benchmark evaluation: extensive testing across multiple benchmark datasets

Comprehensive benchmark suite:
- GAIA Validation: a benchmark for general AI assistants. (paper)
- GAIA-Text-103: a subset of GAIA Validation for text-only tasks. (paper)
- HLE: Humanity's Last Exam. (paper)
- HLE-Text-2158: a subset of HLE for text-only tasks. (paper)
- HLE-Text-500: a subset of HLE for text-only tasks, created by WebThinker. (paper)
- BrowseComp-EN: web browsing and comprehension tasks. (paper)
- BrowseComp-ZH: a Chinese version of BrowseComp. (paper)
- WebWalkerQA: web navigation and question answering. (paper)
- Frames: Factuality, Retrieval, And reasoning MEasurement Set. (paper)
- XBench-DeepSearch: a benchmark for deep research agents. (website)
- FutureX: a live benchmark designed for predicting unknown future events. (website)
- SEAL-0: a benchmark for evaluating LLMs on conflicting-evidence web questions. (paper)
- AIME2025: American Invitational Mathematics Examination 2025. (website)
- DeepSearchQA: Google's Deep Search question answering benchmark. (paper)

Performance on Benchmarks

MiroThinker-v1.5
To prevent potential information leakage (e.g., searching benchmark answers on HuggingFace), access to HuggingFace has been explicitly disabled in these tools. We further perform canary-string testing on the tool outputs of all trajectories and disregard any trajectory found to be contaminated, treating it as an incorrect answer.
MiroThinker-v1.0 and MiroThinker-v0.2
Detailed comparisons with SOTA research agents on the GAIA benchmark for v1.0 and v0.2 are provided in the collapsed sections of the repository README.

MiroThinker-v0.1
GAIA benchmark:

| Method | Text-103 Best Pass@1 | Text-103 Pass@1 (Avg@8) | Val-165 Best Pass@1 | Val-165 Pass@1 (Avg@8) |
|---|---|---|---|---|
| 7B/8B models | | | | |
| Search-o1-7B | 17.5 | - | - | - |
| R1-Searcher-7B | 20.4 | - | - | - |
| WebDancer-7B | 31.0 | - | - | - |
| WebSailor-7B | 37.9 | - | - | - |
| CK-Pro-8B | 40.3 | - | 32.7 | - |
| MiroThinker-8B-SFT-v0.1 | 44.7 | 40.1 | 34.6 | 31.8 |
| + Commercial Tools | 46.6 | 42.1 | 37.6 | 33.9 |
| MiroThinker-8B-DPO-v0.1 | 46.6 | 44.8 | 37.0 | 35.4 |
| + Commercial Tools | 50.5 | 46.7 | 38.2 | 35.9 |
| 14B models | | | | |
| MiroThinker-14B-SFT-v0.1 | 47.6 | 44.4 | 37.0 | 34.4 |
| + Commercial Tools | 49.5 | 47.5 | 41.8 | 39.8 |
| MiroThinker-14B-DPO-v0.1 | 48.5 | 46.6 | 42.4 | 39.2 |
| + Commercial Tools | 52.4 | 48.5 | 45.5 | 42.0 |
| 32B models | | | | |
| Qwen3-32B | 31.1 | 26.7 | 29.7 | 26.4 |
| Search-o1-32B | 28.2 | - | - | - |
| WebThinker-32B-RL | 48.5 | - | - | - |
| WebDancer-QwQ-32B | 51.5 | - | - | - |
| WebSailor-32B | 53.2 | - | - | - |
| WebShaper-QwQ-32B | 53.3 | - | - | - |
| MiroThinker-32B-SFT-v0.1 | 55.3 | 51.3 | 44.9 | 42.7 |
| + Commercial Tools | 58.3 | 54.2 | 48.5 | 45.8 |
| MiroThinker-32B-DPO-v0.1 | 57.3 | 54.1 | 48.5 | 45.9 |
| + Commercial Tools | 60.2 | 57.9 | 50.9 | 48.9 |

Following the practices of WebThinker, WebAgents, and CognitiveKernel, we report Best Pass@1, the highest score across three runs, which often reflects stronger performance, though it may exhibit some variability. To provide a more stable measure, we additionally report Pass@1 (Avg@8), which offers greater consistency at the cost of slightly lower scores. For consistency with prior open-source work, we evaluate GAIA-Text-103 using the WebAgents LLM-as-a-Judge template and report results on GAIA-Val-165 using the official GAIA scorer script.

By default, we use open-source tools wherever possible, except for the code tool E2B and the Google search tool Serper. We use Whisper, Qwen2.5-VL-72B-Instruct, and Qwen3-235B-A22B-Thinking-2507 in our implementation. The framework can easily be extended to other open-source tools of your choice. Replacing these open-source tools with commercial alternatives can yield performance gains. Commercial tools were mainly used for multimodal capabilities and certain complex reasoning subtasks; the majority of tasks, including planning, browsing, refinement, and navigation, were handled by our models.

More benchmarks:

| Method | HLE Pass@1 | Frames Pass@1 | BrowseComp Pass@1 | BrowseComp-ZH Pass@1 | WebWalkerQA Pass@1 |
|---|---|---|---|---|---|
| OpenAI Deep Research | 26.6 | - | 51.5 | 42.9 | - |
| Gemini Deep Research | 26.9 | - | - | - | - |
| Kimi-Researcher | 26.9 | 78.8 | - | - | - |
| WebDancer-7B | - | - | - | - | 36.0 |
| WebSailor-7B | - | - | 6.7 | 14.2 | - |
| MiroThinker-8B-SFT-v0.1 | - | 58.0 | 5.5 | 9.3 | 41.3 |
| MiroThinker-8B-DPO-v0.1 | - | 64.4 | 8.7 | 13.6 | 45.7 |
| WebThinker-32B-RL | - | - | - | - | 46.5 |
| WebDancer-QwQ-32B | - | - | 3.8 | 18.0 | 47.9 |
| WebSailor-32B | - | - | 10.5 | 25.5 | - |
| WebShaper-32B | - | - | - | - | 51.4 |
| MiroThinker-32B-SFT-v0.1 | 10.2 | 70.4 | 10.6 | 13.8 | 45.7 |
| MiroThinker-32B-DPO-v0.1 | 11.8 | 71.7 | 13.0 | 17.0 | 49.3 |

MiroThinker's performance was tested with this repository and open-source tools; other models' results are taken from their papers and official sites. As MiroVerse-v0.1 mainly contains English data, the model's Chinese capability is limited; we plan to add more Chinese data to improve performance in the next version.
Quick Start

Prerequisites:
- Python 3.10+
- uv package manager (installation guide)
- Required API keys (see the configuration section below)

Installation:
# Clone the repository
git clone https://github.com/MiroMindAI/MiroThinker
cd MiroThinker
# Set up the environment
cd apps/miroflow-agent
uv sync
# Configure API keys
cp .env.example .env
# Edit .env with your API keys (SERPER_API_KEY, JINA_API_KEY, E2B_API_KEY, etc.)

Environment variables: see the Tool Configuration section below for required API keys.

Tool Configuration

Minimal configuration for MiroThinker v1.5 and v1.0:

| Server | Description | Tools Provided | Required Environment Variables |
|---|---|---|---|
| tool-python | Execution environment and file management (E2B sandbox) | create_sandbox, run_command, run_python_code, upload_file_from_local_to_sandbox, download_file_from_sandbox_to_local, download_file_from_internet_to_sandbox | E2B_API_KEY |
| search_and_scrape_webpage | Google search via Serper API | google_search | SERPER_API_KEY, SERPER_BASE_URL |
| jina_scrape_llm_summary | Web scraping with LLM-based information extraction | scrape_and_extract_info | JINA_API_KEY, JINA_BASE_URL, SUMMARY_LLM_BASE_URL, SUMMARY_LLM_MODEL_NAME, SUMMARY_LLM_API_KEY |

Minimal .env configuration example:

# Required for MiroThinker v1.5 and v1.0 (minimal setup)
SERPER_API_KEY=your_serper_key
SERPER_BASE_URL="https://google.serper.dev"
JINA_API_KEY=your_jina_key
JINA_BASE_URL="https://r.jina.ai"
E2B_API_KEY=your_e2b_key

# Required for jina_scrape_llm_summary
# Note: the summary LLM can be a small model (e.g., Qwen3-14B or GPT-5-Nano);
# the choice has minimal impact on performance, so use whatever is most convenient
SUMMARY_LLM_BASE_URL="https://your_summary_llm_base_url/v1/chat/completions"
SUMMARY_LLM_MODEL_NAME=your_llm_model_name  # e.g., "Qwen/Qwen3-14B" or "gpt-5-nano"
SUMMARY_LLM_API_KEY=your_llm_api_key  # Optional, depends on LLM provider

# Required for benchmark evaluation (LLM-as-a-Judge)
OPENAI_API_KEY=your_openai_key  # Required for running benchmark evaluations
OPENAI_BASE_URL="https://api.openai.com/v1"  # Optional, defaults to OpenAI's API

Why this is minimal: these 3 MCP servers cover the core capabilities needed for research tasks (web search, content extraction, and code execution); all other servers are optional enhancements.

Summary LLM: SUMMARY_LLM can be a small model like Qwen3-14B or GPT-5-Nano. The choice has minimal impact on overall performance, so use whichever is most convenient for your setup.

For benchmark evaluation: if you plan to run benchmark evaluations, you also need OPENAI_API_KEY (and optionally OPENAI_BASE_URL) for the LLM-as-a-Judge functionality used in the evaluation scripts.

For GAIA multimodal tasks: GAIA-Val-165 includes tasks with image/audio/video files. Since MiroThinker is a text-only LLM, GPT-4o is used to pre-process these files into text descriptions. The same OPENAI_API_KEY is used for both this preprocessing and LLM-as-a-Judge.

For more details, see the MiroFlow Tools README for complete documentation of all available tools.
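As a quick, hedged sanity check (a plain bash loop, not part of the repository), you can confirm the minimal variables above are set before launching the agent; the set -a/source pattern is one common way to load a .env file into the current shell:

set -a; source .env; set +a   # export every KEY=value pair from .env
for v in SERPER_API_KEY JINA_API_KEY E2B_API_KEY SUMMARY_LLM_BASE_URL SUMMARY_LLM_MODEL_NAME; do
  [ -n "${!v}" ] || echo "Missing: $v"   # ${!v} is bash indirect expansion: the value of the variable named by $v
done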
Additional available tools
The following optional tools are available but were not used in the MiroThinker v1.5 and v1.0 evaluations:

| Server Name | Type | Description |
|---|---|---|
| tool-vqa | Commercial | Vision processing using Claude |
| tool-vqa-os | Open-source | Vision processing (open-source alternative) |
| tool-transcribe | Commercial | Audio transcription using OpenAI |
| tool-transcribe-os | Open-source | Audio transcription using Whisper |
| tool-reasoning | Commercial | Reasoning engine using Claude |
| tool-reasoning-os | Open-source | Reasoning engine (open-source alternative) |
| tool-reading | Open-source | Document reading using MarkItDown |
| tool-google-search | Commercial | Web search using Google + scraping |
| tool-sougou-search | Commercial | Web search using Sougou (Chinese) |

Local deployment: for instructions on deploying the open-source tools (tool-vqa-os, tool-transcribe-os, tool-reasoning-os) locally, see the Local Tool Deployment Guide. See the MiroFlow Tools README for complete documentation of all available tools.

Pre-configured Agent Settings
The apps/miroflow-agent/conf/agent/ directory contains several pre-configured agent settings. Each configuration uses different tools and requires corresponding environment variables in your .env file.

Recommended: for MiroThinker v1.5, use mirothinker_v1.5_keep5_max200 (with context management, recommended for most tasks) or mirothinker_v1.5_keep5_max400 (only used for BrowseComp and BrowseComp-ZH). For v1.0, use mirothinker_v1.0_keep5 (with context management). All use the minimal configuration with only 3 MCP servers.

| Configuration | Description | Max Turns | Context Retention | Required Environment Variables | Recommended For |
|---|---|---|---|---|---|
| mirothinker_v1.5_keep5_max200 | Single-agent with context management | 200 | Keep 5 most recent | SERPER_API_KEY, SERPER_BASE_URL, JINA_API_KEY, JINA_BASE_URL, E2B_API_KEY, SUMMARY_LLM_BASE_URL, SUMMARY_LLM_MODEL_NAME, SUMMARY_LLM_API_KEY | v1.5 (recommended for most tasks) |
| mirothinker_v1.5_keep5_max400 | Single-agent with context management | 400 | Keep 5 most recent | Same as above | v1.5 (BrowseComp & BrowseComp-ZH) |
| mirothinker_v1.5 | Single-agent for MiroThinker v1.5 | 600 | Keep all results | Same as above | v1.5 |
| mirothinker_v1.0_keep5 | Single-agent with context management | 600 | Keep 5 most recent | Same as above | v1.0 |
| mirothinker_v1.0 | Single-agent for MiroThinker v1.0 | 600 | Keep all results | Same as above | v1.0 |

Legacy configurations (v0.1/v0.2):

| Configuration | Description | Max Turns | Context Retention | Required Environment Variables | Recommended For |
|---|---|---|---|---|---|
| multi_agent | Multi-agent with commercial tools | 50 | Keep all results | E2B_API_KEY, ANTHROPIC_API_KEY, ANTHROPIC_BASE_URL, OPENAI_API_KEY, OPENAI_BASE_URL, SERPER_API_KEY, SERPER_BASE_URL, JINA_API_KEY, JINA_BASE_URL | v0.1/v0.2 |
| multi_agent_os | Multi-agent with open-source tools | 50 | Keep all results | E2B_API_KEY, VISION_API_KEY, VISION_BASE_URL, VISION_MODEL_NAME, WHISPER_API_KEY, WHISPER_BASE_URL, WHISPER_MODEL_NAME, REASONING_API_KEY, REASONING_BASE_URL, REASONING_MODEL_NAME, SERPER_API_KEY, SERPER_BASE_URL, JINA_API_KEY, JINA_BASE_URL | v0.1/v0.2 |

Note: all environment variables are listed in apps/miroflow-agent/.env.example. Copy it to .env and fill in the values for the tools you plan to use.

Creating Custom Tool Configurations
You can create your own YAML configuration file to freely combine MCP servers.
Here's how:

1. Create a new YAML file in apps/miroflow-agent/conf/agent/:

# conf/agent/my_custom_config.yaml
defaults:
  - default
  - _self_

main_agent:
  tools:
    - tool-python                # Execution environment
    - search_and_scrape_webpage  # Google search
    - jina_scrape_llm_summary    # Web scraping with LLM
    - tool-vqa                   # Vision processing (optional)
    - tool-transcribe            # Audio processing (optional)
    - tool-reasoning             # Reasoning engine (optional)
    - tool-reading               # Document reading (optional)
  max_turns: 400                 # Maximum number of turns

sub_agents:
  agent-browsing:                # Optional sub-agent
    tools:
      - tool-google-search
      - tool-vqa
      - tool-reading
      - tool-python
    max_turns: 50

keep_tool_result: -1  # Context retention budget: -1 keeps all tool results, or specify K to keep only the K most recent tool responses

Context retention strategy: the keep_tool_result parameter implements a recency-based context retention strategy. In the standard ReAct paradigm, all tool outputs are retained in the message history, which can lead to inefficient context utilization. Empirically, we observe that the model's subsequent actions depend primarily on recent observations rather than distant ones. This strategy retains only the most recent K tool responses (where K is the keep_tool_result value) while preserving the complete sequence of thoughts and actions.

Benefits:
- Preserves the reasoning and action trace
- Focuses the model's attention on the most contextually relevant observations
- Frees additional context space for extended reasoning and deeper tool-use trajectories
- Does not lead to performance degradation while allowing more context space for interactive scaling

Usage: set keep_tool_result: -1 to keep all tool results, or specify a positive integer K (e.g., keep_tool_result: 5) to keep only the K most recent tool responses.

2. Use your custom configuration when running evaluations:

cd apps/miroflow-agent
uv run main.py llm=qwen-3 agent=my_custom_config llm.base_url=https://your_base_url/v1

3. Configure environment variables in .env based on the tools you use. All available environment variables are listed in apps/miroflow-agent/.env.example. Copy it to .env and configure the variables according to your chosen configuration:

cd apps/miroflow-agent
cp .env.example .env
# Edit .env with your actual API keys

For MiroThinker v1.5 (mirothinker_v1.5_keep5_max200.yaml, mirothinker_v1.5_keep5_max400.yaml, or mirothinker_v1.5.yaml) and v1.0 (mirothinker_v1.0_keep5.yaml or mirothinker_v1.0.yaml), see the Minimal Configuration section above for the complete configuration example. For other configurations, refer to the Pre-configured Agent Settings table above to see which environment variables are required.
Optional API keys:

# API for LLM-as-a-Judge (required for benchmark evaluation)
OPENAI_API_KEY=your_openai_key
OPENAI_BASE_URL="https://api.openai.com/v1"  # Optional, defaults to OpenAI's API

# API for the open-source audio transcription tool (for benchmark testing, optional)
WHISPER_MODEL_NAME="openai/whisper-large-v3-turbo"
WHISPER_API_KEY=your_whisper_key
WHISPER_BASE_URL="https://your_whisper_base_url/v1"

# API for the open-source VQA tool (for benchmark testing, optional)
VISION_MODEL_NAME="Qwen/Qwen2.5-VL-72B-Instruct"
VISION_API_KEY=your_vision_key
VISION_BASE_URL="https://your_vision_base_url/v1/chat/completions"

# API for the open-source reasoning tool (for benchmark testing, optional)
REASONING_MODEL_NAME="Qwen/Qwen3-235B-A22B-Thinking-2507"
REASONING_API_KEY=your_reasoning_key
REASONING_BASE_URL="https://your_reasoning_base_url/v1/chat/completions"

# API for Claude Sonnet 3.7 as commercial tools (optional)
ANTHROPIC_API_KEY=your_anthropic_key

# API for Sougou search (optional)
TENCENTCLOUD_SECRET_ID=your_tencent_cloud_secret_id
TENCENTCLOUD_SECRET_KEY=your_tencent_cloud_secret_key

# API for the summary LLM (can use small models like Qwen3-14B or GPT-5-Nano)
SUMMARY_LLM_BASE_URL="https://your_summary_llm_base_url/v1/chat/completions"
SUMMARY_LLM_MODEL_NAME=your_summary_llm_model_name  # e.g., "Qwen/Qwen3-14B" or "gpt-5-nano"
SUMMARY_LLM_API_KEY=your_summary_llm_api_key

Serve the MiroThinker Model

Option 1 (recommended): serve with SGLang or vLLM. Use SGLang to serve MiroThinker models at port 61002:

NUM_GPUS=4
PORT=61002
# Downloading the model from HF (v1.5 recommended)
MODEL_PATH=miromind-ai/MiroThinker-v1.5-30B
# Or use v1.0
# MODEL_PATH=miromind-ai/MiroThinker-v1.0-30B

python3 -m sglang.launch_server \
  --model-path $MODEL_PATH \
  --tp $NUM_GPUS \
  --dp 1 \
  --host 0.0.0.0 \
  --port $PORT \
  --trust-remote-code

Server URL: this starts a server at http://0.0.0.0:$PORT. Use it as your server base URL (e.g., http://0.0.0.0:61002/v1).

Option 2: quantized lightweight options. We also provide comprehensive guidance for serving MiroThinker models using CPU-optimized and GPU-accelerated quantization techniques, along with detailed analysis and guidelines for deployment with llama.cpp, Ollama, SGLang, and other inference frameworks. See the Deployment Documentation for detailed deployment instructions.

Run Your First Task

After setting up the environment and starting your model server, run main.py to test with a default question: "What is the title of today's arxiv paper in computer science?"

cd apps/miroflow-agent

# Using MiroThinker models (requires your own model server)
uv run python main.py llm=qwen-3 agent=mirothinker_v1.5_keep5_max200 llm.base_url=http://localhost:61002/v1

# Or using Claude (requires ANTHROPIC_API_KEY in .env)
uv run python main.py llm=claude-3-7 agent=single_agent_keep5

# Or using GPT-5 (requires OPENAI_API_KEY in .env)
uv run python main.py llm=gpt-5 agent=single_agent_keep5

To customize the question, edit main.py line 32: task_description = "Your custom question here". The agent will search the web, execute code if needed, and provide an answer with sources. For more details, see apps/miroflow-agent/README.md for available configurations and troubleshooting.
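The serving section above names vLLM as an alternative to SGLang but only shows the SGLang command. A hedged vLLM sketch follows; the flag names reflect common vLLM usage rather than this repository's docs, so verify them against your installed vLLM version:

NUM_GPUS=4
PORT=61002
MODEL_PATH=miromind-ai/MiroThinker-v1.5-30B

vllm serve $MODEL_PATH \
  --tensor-parallel-size $NUM_GPUS \
  --host 0.0.0.0 \
  --port $PORT \
  --trust-remote-code
# Exposes an OpenAI-compatible endpoint at http://0.0.0.0:$PORT/v1, the same base URL format as the SGLang option.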
Benchmark Evaluation

For researchers who want to reproduce our benchmark results or evaluate on standard benchmarks.

Download benchmark data:

cd MiroThinker  # back to the project root
wget https://huggingface.co/datasets/miromind-ai/MiroFlow-Benchmarks/resolve/main/data_20251115_password_protected.zip
unzip data_20251115_password_protected.zip  # Password: pf4*
rm data_20251115_password_protected.zip

Run benchmark evaluation. Note: for MiroThinker v1.5, use the mirothinker_v1.5_keep5_max200 (with context management), mirothinker_v1.5_keep5_max400 (with context management), or mirothinker_v1.5 configurations. For v1.0, use mirothinker_v1.0_keep5 (with context management) or mirothinker_v1.0.

Available parameters (set these environment variables before running the script):

| Parameter | Default | Description |
|---|---|---|
| LLM_MODEL | "MiroThinker-Models" | Model name identifier |
| BASE_URL | "https://your-api.com/v1" | Base URL of your model server |
| NUM_RUNS | Varies by benchmark | Number of evaluation runs (3 for most benchmarks, 8 for GAIA/XBench/FutureX/SEAL-0, 32 for AIME2025) |
| LLM_PROVIDER | "qwen" | LLM provider (e.g., qwen, openai, anthropic) |
| AGENT_SET | "mirothinker_v1.5_keep5_max200" | Agent configuration (e.g., mirothinker_v1.5_keep5_max200, mirothinker_v1.5_keep5_max400, mirothinker_v1.0_keep5) |
| MAX_CONTEXT_LENGTH | 262144 | Maximum context length (256K) |
| MAX_CONCURRENT | 10 | Maximum concurrent tasks |
| PASS_AT_K | 1 | Pass@K evaluation metric |
| TEMPERATURE | 1.0 | Sampling temperature |
| API_KEY | "xxx" | API key for the model server |

Example usage:

# Navigate to the miroflow-agent directory first
cd apps/miroflow-agent

# Basic usage with v1.5 (recommended)
NUM_RUNS=8 LLM_MODEL="MiroThinker-v1.5-30B" BASE_URL="https://your-api.com/v1" bash scripts/run_evaluate_multiple_runs_gaia-validation-text-103.sh
# Or with v1.0
# NUM_RUNS=8 LLM_MODEL="MiroThinker-v1.0-30B" BASE_URL="https://your-api.com/v1" bash scripts/run_evaluate_multiple_runs_gaia-validation-text-103.sh

# Customize the number of runs and the agent configuration (v1.5 with context management)
LLM_MODEL="MiroThinker-v1.5-30B" \
BASE_URL="https://your-api.com/v1" \
NUM_RUNS=8 \
AGENT_SET="mirothinker_v1.5_keep5_max200" \
bash scripts/run_evaluate_multiple_runs_gaia-validation-text-103.sh

# Or with the v1.0 configuration (with context management)
# LLM_MODEL="MiroThinker-v1.0-30B" \
# BASE_URL="https://your-api.com/v1" \
# NUM_RUNS=8 \
# AGENT_SET="mirothinker_v1.0_keep5" \
# bash scripts/run_evaluate_multiple_runs_gaia-validation-text-103.sh

All benchmark commands

Important for MiroThinker v1.5: to reproduce our reported results, you must set the correct AGENT_SET:
- BrowseComp & BrowseComp-ZH: use AGENT_SET="mirothinker_v1.5_keep5_max400"
- All other benchmarks: use AGENT_SET="mirothinker_v1.5_keep5_max200"

# Navigate to the miroflow-agent directory first
cd apps/miroflow-agent

# HLE
NUM_RUNS=3 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_hle.sh
# HLE-Text-2158
NUM_RUNS=3 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_hle-text-2158.sh
# HLE-Text-500
NUM_RUNS=3 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_hle-text-500.sh
# GAIA-Text-103
NUM_RUNS=8 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_gaia-validation-text-103.sh
# GAIA-Validation (GAIA-Val-165)
NUM_RUNS=8 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_gaia-validation.sh
# BrowseComp-EN (use max400)
NUM_RUNS=3 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max400" bash scripts/run_evaluate_multiple_runs_browsecomp.sh
# BrowseComp-ZH (use max400)
NUM_RUNS=3 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max400" bash scripts/run_evaluate_multiple_runs_browsecomp_zh.sh
# WebWalkerQA
NUM_RUNS=3 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_webwalkerqa.sh
# XBench-DeepSearch
NUM_RUNS=8 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_xbench_deepsearch.sh
# FRAMES
NUM_RUNS=3 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_frames.sh
# SEAL-0
NUM_RUNS=8 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_seal-0.sh
# FutureX
NUM_RUNS=8 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_futurex.sh
# AIME2025
NUM_RUNS=32 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_aime2025.sh
# DeepSearchQA
NUM_RUNS=3 LLM_MODEL="xxx" BASE_URL="xxx" AGENT_SET="mirothinker_v1.5_keep5_max200" bash scripts/run_evaluate_multiple_runs_deepsearchqa.sh

Monitor evaluation progress

# Navigate to the miroflow-agent directory first
cd apps/miroflow-agent

# For HLE
python benchmarks/check_progress/check_progress_hle.py /path/to/evaluation/logs
# For HLE-Text-2158
python benchmarks/check_progress/check_progress_hle-text-2158.py /path/to/evaluation/logs
# For HLE-Text-500
python benchmarks/check_progress/check_progress_hle-text-500.py /path/to/evaluation/logs
# For BrowseComp-EN
python benchmarks/check_progress/check_progress_browsecomp.py /path/to/evaluation/logs
# For BrowseComp-ZH
python benchmarks/check_progress/check_progress_browsecomp_zh.py /path/to/evaluation/logs
# For GAIA-Validation
python benchmarks/check_progress/check_progress_gaia-validation.py /path/to/evaluation/logs
# For GAIA-Text-103
python benchmarks/check_progress/check_progress_gaia-validation-text-103.py /path/to/evaluation/logs
# For WebWalkerQA
python benchmarks/check_progress/check_progress_webwalkerqa.py /path/to/evaluation/logs
# For Frames
python benchmarks/check_progress/check_progress_frames.py /path/to/evaluation/logs
# For XBench-DeepSearch
python benchmarks/check_progress/check_progress_xbench_deepsearch.py /path/to/evaluation/logs
# For SEAL-0
python benchmarks/check_progress/check_progress_seal-0.py /path/to/evaluation/logs
# For AIME2025
python benchmarks/check_progress/check_progress_aime2025.py /path/to/evaluation/logs
# For DeepSearchQA
python benchmarks/check_progress/check_progress_deepsearchqa.py /path/to/evaluation/logs

Trace Collection

cd apps/collect-trace

# Collect traces for SFT
bash scripts/collect_trace_claude37.sh
bash scripts/collect_trace_gpt5.sh
# Collect traces for DPO
bash scripts/collect_trace_qwen3.sh

FAQ & Troubleshooting

Common issues

Q: Which version should I use?
A: We recommend MiroThinker v1.5 with the minimal configuration: v1.5 is the latest version, with a 256K context and world-leading performance.
Use one of the context-managed configs:
- mirothinker_v1.5_keep5_max200 (up to 200 turns, recommended for most tasks)
- mirothinker_v1.5_keep5_max400 (up to 400 turns, only used for BrowseComp and BrowseComp-ZH)

Q: How do I get API keys?
A: You need these keys for the minimal setup:
- SERPER_API_KEY: get from Serper.dev (Google search API)
- JINA_API_KEY: get from Jina.ai (web scraping)
- E2B_API_KEY: get from E2B.dev (code execution sandbox)
- SUMMARY_LLM_API_KEY: your LLM API credentials (for content summarization). This can be a small model like Qwen3-14B or GPT-5-Nano; the choice has minimal impact on performance.
- OPENAI_API_KEY: get from OpenAI (required for benchmark evaluation, used for LLM-as-a-Judge)
- OPENAI_BASE_URL: optional, defaults to https://api.openai.com/v1; can be changed to use OpenAI-compatible APIs.

Q: Model server connection errors
A: Common issues:
- Check the base URL format: it should end with /v1 (e.g., https://your-api.com/v1)
- Verify the API key: ensure API_KEY is set correctly in the environment or script
- Check server status: make sure your model server is running and accessible
- Network issues: verify that firewall/network settings allow connections

Q: Evaluation script fails to run
A: Troubleshooting steps:
- Check the working directory: make sure you are in the apps/miroflow-agent directory
- Verify the environment: run uv sync to ensure dependencies are installed
- Check the .env file: ensure all required environment variables are set
- Review logs: check the logs/ directory for detailed error messages
- Verify the data path: ensure benchmark data is downloaded and in the correct location

Q: Out of memory errors
A: Solutions:
- Reduce the context length: set MAX_CONTEXT_LENGTH to a smaller value (e.g., 131072 for 128K)
- Use context management with fewer turns: for v1.5, use mirothinker_v1.5_keep5_max200 or mirothinker_v1.5_keep5_max400; for v1.0, use mirothinker_v1.0_keep5
- Reduce concurrent tasks: set MAX_CONCURRENT to a smaller number (e.g., 5)
- Use a smaller model: for v1.5, try 30B instead of 235B; for v1.0, try 8B or 30B instead of 72B

Q: Tool execution errors
A: Common fixes:
- E2B errors: verify E2B_API_KEY is valid and the account has credits
- Serper errors: check SERPER_API_KEY and rate limits
- Jina errors: verify JINA_API_KEY and JINA_BASE_URL are correct
- LLM summarization errors: check the SUMMARY_LLM_* variables and model availability

Q: How do I monitor long-running evaluations?
A: Use the progress monitoring scripts:

cd apps/miroflow-agent
python benchmarks/check_progress/check_progress_<benchmark_name>.py /path/to/logs

The scripts show completion status, elapsed time, and estimated remaining time.

Getting help
- Documentation: check the MiroFlow Tools README for tool details
- Discord: join our Discord community
- Issues: report bugs on GitHub Issues
- Contact: visit our website for more information

License
This project is licensed under the MIT License; see the LICENSE file for details.

Acknowledgments
We extend our sincere gratitude to:
- Benchmark contributors, for the comprehensive evaluation datasets
- The open-source community, for the tools and libraries that make this possible
- All contributors who have helped make MiroThinker better

Join our community and help us build the future of AI agents!
References
If you find this project useful in your research, please consider citing:

@article{miromind2025mirothinker,
  title={MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling},
  author={MiroMind Team and Bai, Song and Bing, Lidong and Chen, Carson and Chen, Guanzheng and Chen, Yuntao and Chen, Zhe and Chen, Ziyi and Dai, Jifeng and Dong, Xuan and others},
  journal={arXiv preprint arXiv:2511.11793},
  year={2025}
}

ComfyUI-LTXVideo: LTX-Video support for ComfyUI

A collection of powerful custom nodes that extend ComfyUI's capabilities for the LTX-2 video generation model. LTX-2 is built into ComfyUI core (see it here), making it readily accessible to all ComfyUI users. This repository hosts additional nodes and workflows to help you get the most out of LTX-2's advanced features. To learn more about LTX-2, see the main LTX-2 repository for model details and additional resources.

Prerequisites
Before you begin using an LTX-2 workflow in ComfyUI, make sure you have:
- ComfyUI installed (download here: https://www.comfy.org/download)
- A CUDA-compatible GPU with 32GB+ VRAM
- 100GB+ free disk space for models and cache

Quick Start
We recommend using the LTX-2 workflows available in Comfy Manager:
1. Open ComfyUI
2. Click the Manager button (or press Ctrl+M)
3. Select Install Custom Nodes
4. Search for "LTXVideo"
5. Click Install
6. Wait for installation to complete
7. Restart ComfyUI
The nodes will appear in your node menu under the "LTXVideo" category. Required models will be downloaded on first use.

Example Workflows
The ComfyUI-LTXVideo installation includes several example workflows. You can find them all under ComfyUI/custom_nodes/ComfyUI-LTXVideo/example_workflows/:
- Text to video, full model
- Text to video, distilled model (fast)
- Image to video, full model
- Image to video, distilled model (fast)
- Video to video detailer
- IC-LoRA distilled model (depth + human pose + edges)

Required Models
Download the following models (a hedged download sketch follows at the end of this entry):
- LTX-2 model checkpoint: choose and download one of the following to the COMFYUI_ROOT_FOLDER/models/checkpoints folder: ltx-2-19b-dev-fp8.safetensors, ltx-2-19b-distilled-fp8.safetensors, ltx-2-19b-dev.safetensors, or ltx-2-19b-distilled.safetensors.
- Spatial upscaler: required for the current two-stage pipeline implementations in this repository. Download ltx-2-spatial-upscaler-x2-1.0.safetensors to the COMFYUI_ROOT_FOLDER/models/latent_upscale_models folder.
- Temporal upscaler: required for the current two-stage pipeline implementations in this repository. Download ltx-2-temporal-upscaler-x2-1.0.safetensors to the COMFYUI_ROOT_FOLDER/models/latent_upscale_models folder.
- Distilled LoRA: required for the current two-stage pipeline implementations in this repository (except DistilledPipeline and ICLoraPipeline). Download ltx-2-19b-distilled-lora-384.safetensors to the COMFYUI_ROOT_FOLDER/models/loras folder.
- Gemma text encoder: download all files from the repository to COMFYUI_ROOT_FOLDER/models/text_encoders/gemma-3-12b-it-qat-q4_0-unquantized.
- LTX-2 LoRAs: choose and download to the COMFYUI_ROOT_FOLDER/models/loras folder: ltx-2-19b-ic-lora-canny-control.safetensors, ltx-2-19b-ic-lora-depth-control.safetensors, ltx-2-19b-ic-lora-detailer.safetensors, ltx-2-19b-ic-lora-pose-control.safetensors, ltx-2-19b-lora-camera-control-dolly-in.safetensors, ltx-2-19b-lora-camera-control-dolly-left.safetensors, ltx-2-19b-lora-camera-control-dolly-out.safetensors, ltx-2-19b-lora-camera-control-dolly-right.safetensors, ltx-2-19b-lora-camera-control-jib-down.safetensors, ltx-2-19b-lora-camera-control-jib-up.safetensors, ltx-2-19b-lora-camera-control-static.safetensors.

Advanced Techniques
Low VRAM: for systems with low VRAM, you can use the model loader nodes from low_vram_loaders.py. Those nodes ensure the correct order of execution and perform model offloading so that generation fits in 32 GB of VRAM. You can also use the --reserve-vram ComfyUI parameter: python -m main --reserve-vram 5 (or another number, in GB).
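A hedged sketch of fetching one checkpoint and the two upscalers with the Hugging Face CLI; the repo IDs in angle brackets are placeholders (check the LTX-2 model cards for the exact names), and the target folders follow the layout described above:

pip install -U "huggingface_hub[cli]"
huggingface-cli download <ltx-2-checkpoints-repo> ltx-2-19b-distilled-fp8.safetensors \
  --local-dir ComfyUI/models/checkpoints
huggingface-cli download <ltx-2-upscalers-repo> ltx-2-spatial-upscaler-x2-1.0.safetensors \
  --local-dir ComfyUI/models/latent_upscale_models
huggingface-cli download <ltx-2-upscalers-repo> ltx-2-temporal-upscaler-x2-1.0.safetensors \
  --local-dir ComfyUI/models/latent_upscale_models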
For complete information about using LTX-2 models, workflows, and nodes in ComfyUI, please visit our Open Source documentation.

Hugging Face Papers

Recent advances in video diffusion models have shifted towards transformer-based architectures, achieving state-of-the-art video generation but at the cost of quadratic attention complexity, which severely limits scalability for longer sequences. We introduce ReHyAt, a Recurrent Hybrid Attention mechanism that combines the fidelity of softmax attention with the efficiency of linear attention, enabling chunk-wise recurrent reformulation and constant memory usage. Unlike the concurrent linear-only SANA Video, ReHyAt's hybrid design allows efficient distillation from existing softmax-based models, reducing the training cost by two orders of magnitude to ~160 GPU hours while remaining competitive in quality. Our lightweight distillation and finetuning pipeline provides a recipe that can be applied to future state-of-the-art bidirectional softmax-based models. Experiments on VBench and VBench-2.0, as well as a human preference study, demonstrate that ReHyAt achieves state-of-the-art video quality while reducing attention cost from quadratic to linear, unlocking practical scalability for long-duration and on-device video generation. Project page is available at https://qualcomm-ai-research.github.io/rehyat.

Recently proposed pyramidal models decompose the conventional forward and backward diffusion processes into multiple stages operating at varying resolutions. These models handle inputs with higher noise levels at lower resolutions, while less noisy inputs are processed at higher resolutions. This hierarchical approach significantly reduces the computational cost of inference in multi-step denoising models. However, existing open-source pyramidal video models have been trained from scratch and tend to underperform compared to state-of-the-art systems in terms of visual plausibility. In this work, we present a pipeline that converts a pretrained diffusion model into a pyramidal one through low-cost finetuning, achieving this transformation without degradation in quality of output videos. Furthermore, we investigate and compare various strategies for step distillation within pyramidal models, aiming to further enhance the inference efficiency. Our results are available at https://qualcomm-ai-research.github.io/PyramidalWan.

As conversational agents accumulate experience collaborating with users, adapting to user preferences is essential for fostering long-term relationships and improving collaboration quality over time. We introduce MultiSessionCollab, a benchmark that evaluates how well agents can learn user preferences and leverage them to improve collaboration quality across multiple sessions. To develop agents that succeed in this setting, we present long-term collaborative agents equipped with a memory that persists and refines user preferences as interaction experience accumulates. Moreover, we demonstrate that learning signals can be derived from user simulator behavior in MultiSessionCollab to train agents to generate more comprehensive reflections and update their memory more effectively. Extensive experiments show that equipping agents with memory improves long-term collaboration, yielding higher task success rates, more efficient interactions, and reduced user effort. Finally, we conduct a human user study that demonstrates that memory helps improve user experience in real-world settings.

Autoregressive (AR) models have achieved remarkable success in image synthesis, yet their sequential nature imposes significant latency constraints. Speculative decoding offers a promising avenue for acceleration, but existing approaches are limited by token-level ambiguity and a lack of spatial awareness. In this work, we introduce Multi-Scale Local Speculative Decoding (MuLo-SD), a novel framework that combines multi-resolution drafting with spatially informed verification to accelerate AR image generation. Our method leverages a low-resolution drafter paired with learned up-samplers to propose candidate image tokens, which are then verified in parallel by a high-resolution target model. Crucially, we incorporate a local rejection and resampling mechanism, enabling efficient correction of draft errors by focusing on spatial neighborhoods rather than raster-scan resampling after the first rejection. We demonstrate that MuLo-SD achieves substantial speedups of up to 1.7x, outperforming strong speculative decoding baselines such as EAGLE-2 and LANTERN in terms of acceleration, while maintaining comparable semantic alignment and perceptual quality. These results are validated using GenEval, DPG-Bench, and FID/HPSv2 on the MS-COCO 5k validation split. Extensive ablations highlight the impact of up-sampling design, probability pooling, and local rejection and resampling with neighborhood expansion. Our approach sets a new state of the art in speculative decoding for image synthesis, bridging the gap between efficiency and fidelity.

Behavior cloning is enjoying a resurgence in popularity as scaling both model and data sizes proves to provide a strong starting point for many tasks of interest. In this work, we introduce an open recipe for training a video-game-playing foundation model designed for real-time inference on a consumer GPU. We release all data (8300+ hours of high-quality human gameplay), training and inference code, and pretrained checkpoints under an open license. We show that our best model is capable of playing a variety of 3D video games at a level competitive with human play. We use this recipe to systematically examine the scaling laws of behavior cloning to understand how the model's performance and causal reasoning vary with model and data scale. We first show in a simple toy problem that, for some types of causal reasoning, increasing both the amount of training data and the depth of the network results in the model learning a more causal policy. We then systematically study how causality varies with the number of parameters (and depth) and training steps in scaled models of up to 1.2 billion parameters, and we find scaling results similar to what we observe in the toy problem.

YouTube

The Biggest AI News Updates Were NOT at CES
Matt Wolfe
Anthropic just burned so much trust...
Theo - t3.gg
The Tailwind drama
Theo - t3.gg
How To Grow An Audience If You Have 0 Followers
Dan Koe
I moved off of Next.js
Theo - t3.gg
The Arrogance of Probability and the Wisdom of Odds: A Review of "Fooled by Randomness" and "The Black Swan"
脑总MrBrain
I'm addicted to Claude Code (i get it now)
Theo - t3.gg
I can't believe he was right.
Theo - t3.gg
You're logging wrong [FIXED]
Theo - t3.gg
A Dialogue Between Two Generations of Readers: Our Generation of Public Intellectuals @routangseng
脑总MrBrain
How I'd build a one-person business (if I started over in 2026)
Dan Koe
2025: The year I stopped writing code
Theo - t3.gg
A Survival Guide for the Age of Mass Extinction: A Review of Taleb's Ideas
脑总MrBrain
How I parsed billions of rows for every user in 2 seconds
Theo - t3.gg
Nvidia's $20B Loophole Explained
Matt Wolfe
Finance Rests on Force: Why Did the Meiji Restoration Create Zaibatsu While the Qing Dynasty's Richest Man Could Only Be Fleeced?
脑总MrBrain
OpenAI: Trapped in 2nd place
Theo - t3.gg
How to fix your entire life in 1 day
Dan Koe
The Nvidia Groq Acquisition Explained
Matt Wolfe
It was a wild year for CSS
Theo - t3.gg


Weather

Clear sky, feels like -1°
Hourly: 2am 3° · 4am 3° · 6am 0° · 8am 1° · 10am 0° · 12pm 4° · 2pm 6° · 4pm 7° · 6pm 6° · 8pm 3° · 10pm 1° · 12am 0°
Shanghai, Shanghai