GitHub Trending

# Webnovel Writer

Webnovel Writer is a long-form webnovel writing assistant built on Claude Code. It aims to reduce the "forgetting" and "hallucination" problems of AI writing and to support serialized works on the order of 2,000,000 Chinese characters.

Detailed documentation lives in `docs/`:

- Architecture and modules: docs/architecture.md
- Command reference: docs/commands.md
- RAG and configuration: docs/rag-and-config.md
- Genre templates: docs/genres.md
- Operations and recovery: docs/operations.md
- Documentation index: docs/README.md

## Quick Start

### 1) Install the plugin (official Marketplace)

```
claude plugin marketplace add lingfengQAQ/webnovel-writer --scope user
claude plugin install webnovel-writer@webnovel-writer-marketplace --scope user
```

To install for the current project only, replace `--scope user` with `--scope project`.

### 2) Install Python dependencies

```
python -m pip install -r https://raw.githubusercontent.com/lingfengQAQ/webnovel-writer/HEAD/requirements.txt
```

Note: this entry point installs both the core writing pipeline and the Dashboard dependencies.

### 3) Initialize a novel project

Run in Claude Code:

```
/webnovel-init
```

`/webnovel-init` creates a PROJECT_ROOT subdirectory named after the book under the current workspace, and writes the current-project pointer to `workspace/.claude/.webnovel-current-project`.

### 4) Configure the RAG environment (required)

From the root of the initialized book project, create `.env`:

```
cp .env.example .env
```

Minimal configuration example:

```
EMBED_BASE_URL=https://api-inference.modelscope.cn/v1
EMBED_MODEL=Qwen/Qwen3-Embedding-8B
EMBED_API_KEY=your_embed_api_key
RERANK_BASE_URL=https://api.jina.ai/v1
RERANK_MODEL=jina-reranker-v3
RERANK_API_KEY=your_rerank_api_key
```

### 5) Start writing

```
/webnovel-plan 1
/webnovel-write 1
/webnovel-review 1-5
```

### 6) Launch the visual dashboard (optional)

```
/webnovel-dashboard
```

The Dashboard is a read-only panel (project status, entity graph, chapter/outline browsing, reader-retention view). The front-end build artifacts ship with the plugin, so users do not need a local `npm build`.

### 7) Agent model settings (optional)

All built-in Agents in this project default to:

```
model: inherit
```

meaning each sub-Agent inherits the model of the current Claude session. To pin a model for a specific Agent, edit the frontmatter of the corresponding file (`webnovel-writer/agents/*.md`), for example:

```
---
name: context-agent
description: ...
tools: Read, Grep, Bash
model: sonnet
---
```

Common values: `inherit` / `sonnet` / `opus` / `haiku` (subject to what Claude Code currently supports).

## Changelog

| Version | Notes |
| --- | --- |
| v5.5.0 (current) | Adds the read-only visual Dashboard Skill (`/webnovel-dashboard`) with live refresh; supports launching from the plugin directory and ships a prebuilt front end |
| v5.4.4 | Adopts the official Plugin Marketplace install mechanism; unifies CLI invocation across Skills/Agents/References (single `CLAUDE_PLUGIN_ROOT` path, pass-through commands standardized on `--`) |
| v5.4.3 | Improves smart RAG context assistance (auto/graph_hybrid with BM25 fallback) |
| v5.3 | Introduces the reader-retention system (hooks / cool points / micro-payoffs / debt tracking) |

## License

This project is licensed under GPL v3; see LICENSE.

## Star History

## Acknowledgements

Developed with Claude Code + Gemini CLI + Codex in a Vibe Coding workflow. Inspired by a Linux.do post.

## Contributing

Issues and PRs are welcome:

```
git checkout -b feature/your-feature
git commit -m "feat: add your feature"
git push origin feature/your-feature
```
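The RAG settings in step 4 are plain KEY=VALUE pairs. As a language-agnostic illustration — this parser is not part of webnovel-writer, and the function name and behavior are assumptions — such a file could be parsed in Python like this:

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """\
EMBED_BASE_URL=https://api-inference.modelscope.cn/v1
EMBED_MODEL=Qwen/Qwen3-Embedding-8B
# the keys below configure the reranker
RERANK_MODEL=jina-reranker-v3
"""
print(parse_env(sample)["EMBED_MODEL"])  # Qwen/Qwen3-Embedding-8B
```

In practice a library such as python-dotenv does this (plus quoting and interpolation), but the sketch shows what the file format amounts to.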

# Qwen-Agent

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, a Chrome extension, etc.

💜 Qwen Chat | 🤗 Hugging Face | 🤖 ModelScope | 📑 Blog | 📖 Documentation | 📊 Benchmark | 💬 WeChat (微信) | 🫨 Discord

Qwen-Agent is a framework for developing LLM applications based on the instruction following, tool usage, planning, and memory capabilities of Qwen. It also comes with example applications such as Browser Assistant, Code Interpreter, and Custom Assistant. Qwen-Agent now serves as the backend of Qwen Chat.

## News

- 🔥🔥🔥 Feb 16, 2026: Open-sourced Qwen3.5. For usage examples, refer to the Qwen3.5 Agent Demo.
- Jan 27, 2026: Open-sourced the agent evaluation benchmark DeepPlanning and added Qwen-Agent documentation.
- Sep 23, 2025: Added the Qwen3-VL Tool-call Demo, supporting tools such as zoom-in, image search, and web search.
- Jul 23, 2025: Added the Qwen3-Coder Tool-call Demo; added native API tool-call interface support, such as using vLLM's built-in tool-call parsing.
- May 1, 2025: Added the Qwen3 Tool-call Demo and MCP Cookbooks.
- Mar 18, 2025: Support for the reasoning_content field; adjusted the default Function Call template, which applies to the Qwen2.5-series general models and QwQ-32B. To use the old template, refer to the example for passing parameters.
- Mar 7, 2025: Added the QwQ-32B Tool-call Demo. It supports parallel, multi-step, and multi-turn tool calls.
- Dec 3, 2024: Upgraded the GUI to Gradio 5. Note: the GUI requires Python 3.10 or higher.
- Sep 18, 2024: Added the Qwen2.5-Math Demo to showcase the Tool-Integrated Reasoning capabilities of Qwen2.5-Math. Note: the Python executor is not sandboxed and is intended for local testing only, not for production use.

## Getting Started

### Installation

Install the stable version from PyPI:

```bash
pip install -U "qwen-agent[gui,rag,code_interpreter,mcp]"
# Or use `pip install -U qwen-agent` for the minimal requirements.
# The optional requirements, specified in double brackets, are:
#   [gui] for Gradio-based GUI support;
#   [rag] for RAG support;
#   [code_interpreter] for Code Interpreter support;
#   [mcp] for MCP support.
```

Alternatively, you can install the latest development version from source:

```bash
git clone https://github.com/QwenLM/Qwen-Agent.git
cd Qwen-Agent
pip install -e ./"[gui,rag,code_interpreter,mcp]"
# Or `pip install -e ./` for minimal requirements.
```

### Preparation: Model Service

You can either use the model service provided by Alibaba Cloud's DashScope, or deploy and use your own model service based on the open-source Qwen models.

If you choose the model service offered by DashScope, set the environment variable DASHSCOPE_API_KEY to your unique DashScope API key.

Alternatively, to deploy and use your own model service, follow the instructions in the README of Qwen2 for deploying an OpenAI-compatible API service. Specifically, consult the vLLM section for high-throughput GPU deployment or the Ollama section for local CPU (+GPU) deployment.

For the QwQ and Qwen3 models, it is recommended not to add the --enable-auto-tool-choice and --tool-call-parser hermes parameters, as Qwen-Agent parses the tool outputs from vLLM on its own. For Qwen3-Coder, it is recommended to enable both of the above parameters, use vLLM's built-in tool parsing, and combine this with the use_raw_api parameter.

## Developing Your Own Agent

Qwen-Agent offers atomic components, such as LLMs (which inherit from class BaseChatModel and come with function calling) and Tools (which inherit from class BaseTool), along with high-level components like Agents (derived from class Agent).
The following example illustrates the process of creating an agent capable of reading PDF files and utilizing tools, as well as incorporating a custom tool:

```python
import pprint
import urllib.parse

import json5

from qwen_agent.agents import Assistant
from qwen_agent.tools.base import BaseTool, register_tool
from qwen_agent.utils.output_beautify import typewriter_print


# Step 1 (Optional): Add a custom tool named `my_image_gen`.
@register_tool('my_image_gen')
class MyImageGen(BaseTool):
    # The `description` tells the agent the functionality of this tool.
    description = 'AI painting (image generation) service, input text description, and return the image URL drawn based on text information.'
    # The `parameters` tell the agent what input parameters the tool has.
    parameters = [{
        'name': 'prompt',
        'type': 'string',
        'description': 'Detailed description of the desired image content, in English',
        'required': True
    }]

    def call(self, params: str, **kwargs) -> str:
        # `params` are the arguments generated by the LLM agent.
        prompt = json5.loads(params)['prompt']
        prompt = urllib.parse.quote(prompt)
        return json5.dumps(
            {'image_url': f'https://image.pollinations.ai/prompt/{prompt}'},
            ensure_ascii=False)


# Step 2: Configure the LLM you are using.
llm_cfg = {
    # Use the model service provided by DashScope:
    'model': 'qwen-max-latest',
    'model_type': 'qwen_dashscope',
    # 'api_key': 'YOUR_DASHSCOPE_API_KEY',
    # It will use the `DASHSCOPE_API_KEY` environment variable if 'api_key' is not set here.

    # Use a model service compatible with the OpenAI API, such as vLLM or Ollama:
    # 'model': 'Qwen2.5-7B-Instruct',
    # 'model_server': 'http://localhost:8000/v1',  # base_url, also known as api_base
    # 'api_key': 'EMPTY',

    # (Optional) LLM hyperparameters for generation:
    'generate_cfg': {
        'top_p': 0.8
    }
}

# Step 3: Create an agent. Here we use the `Assistant` agent as an example,
# which is capable of using tools and reading files.
system_instruction = '''After receiving the user's request, you should:
- first draw an image and obtain the image url,
- then run code `request.get(image_url)` to download the image,
- and finally select an image operation from the given document to process the image.
Please show the image using `plt.show()`.'''
tools = ['my_image_gen', 'code_interpreter']  # `code_interpreter` is a built-in tool for executing code.
files = ['./examples/resource/doc.pdf']  # Give the bot a PDF file to read.
bot = Assistant(llm=llm_cfg,
                system_message=system_instruction,
                function_list=tools,
                files=files)

# Step 4: Run the agent as a chatbot.
messages = []  # This stores the chat history.
while True:
    # For example, enter the query "draw a dog and rotate it 90 degrees".
    query = input('\nuser query: ')
    # Append the user query to the chat history.
    messages.append({'role': 'user', 'content': query})
    response = []
    response_plain_text = ''
    print('bot response:')
    for response in bot.run(messages=messages):
        # Streaming output.
        response_plain_text = typewriter_print(response, response_plain_text)
    # Append the bot responses to the chat history.
    messages.extend(response)
```

In addition to using built-in agent implementations such as class Assistant, you can also develop your own agent implementation by inheriting from class Agent. The framework also provides a convenient GUI interface, supporting rapid deployment of Gradio demos for agents. For the example above, you can quickly launch a Gradio demo using the following code:

```python
from qwen_agent.gui import WebUI

WebUI(bot).run()  # `bot` is the agent defined in the code above; the definition is not repeated here to save space.
```

Now you can chat with the agent in the web UI. Please refer to the examples directory for more usage examples.

## FAQ

### How to use the Code Interpreter tool?

We implement a code interpreter tool based on local Docker containers.
You can enable the built-in code interpreter tool for your agent, allowing it to autonomously write code according to the specific scenario, execute it securely within an isolated sandbox environment, and return the execution results.

⚠️ Note: before using this tool, please ensure that Docker is installed and running on your local operating system. The time required to build the container image for the first time depends on your network conditions. For Docker installation and setup instructions, please refer to the official documentation.

### How to use MCP?

You can select the required tools on the open-source MCP server website and configure the relevant environment. Example of the MCP invocation format:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-memory"]
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/files"]
    },
    "sqlite": {
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path", "test.db"]
    }
  }
}
```

For more details, refer to the MCP usage example. The dependencies required to run this example are:

```bash
# Node.js (download and install the latest version from the Node.js official website)
# uv 0.4.18 or higher (check with `uv --version`)
# Git (check with `git --version`)
# SQLite (check with `sqlite3 --version`)

# macOS users can install these components using Homebrew:
brew install uv git sqlite3

# Windows users can install these components using winget:
winget install --id=astral-sh.uv -e
winget install git.git sqlite.sqlite
```

### Do you have function calling (aka tool calling)?

Yes. The LLM classes provide function calling. Additionally, some Agent classes are also built upon the function calling capability, e.g., FnCallAgent and ReActChat. The current default tool-calling template natively supports parallel function calls.

### How to pass LLM parameters to the Agent?
```python
llm_cfg = {
    # The model name being used:
    'model': 'qwen3-32b',
    # The model service being used:
    'model_type': 'qwen_dashscope',
    # If 'api_key' is not set here, it defaults to reading the `DASHSCOPE_API_KEY` environment variable:
    'api_key': 'YOUR_DASHSCOPE_API_KEY',

    # Using an OpenAI API compatible model service, such as vLLM or Ollama:
    # 'model': 'qwen3-32b',
    # 'model_server': 'http://localhost:8000/v1',  # base_url, also known as api_base
    # 'api_key': 'EMPTY',

    # (Optional) LLM hyperparameters:
    'generate_cfg': {
        # This parameter affects the tool-call parsing logic. Default is False:
        #   True: content is `<think>this is the thought</think>this is the answer`
        #   False: the response consists of reasoning_content and content
        # 'thought_in_content': True,

        # Tool-call template: default is 'nous' (recommended for Qwen3):
        # 'fncall_prompt_type': 'nous',

        # Maximum input length; messages are truncated if they exceed it. Adjust according to the model API:
        # 'max_input_tokens': 58000,

        # Parameters passed directly to the model API, such as top_p, enable_thinking, etc., per the API specification:
        # 'top_p': 0.8,

        # Use the API's native tool-call interface:
        # 'use_raw_api': True,
    }
}
```

### How to do question answering over super-long documents involving 1M tokens?

We have released a fast RAG solution, as well as an expensive but competitive agent, for question answering over super-long documents. They have managed to outperform native long-context models on two challenging benchmarks while being more efficient, and they perform perfectly in the single-needle "needle-in-the-haystack" pressure test involving 1M-token contexts. See the blog for technical details.

## Application: BrowserQwen

BrowserQwen is a browser assistant built upon Qwen-Agent. Please refer to its documentation for details.
## Disclaimer

The Docker container-based code interpreter mounts only the specified working directory and implements basic sandbox isolation, but it should still be used with caution in production environments.

# React Grab

Select context for coding agents directly from your website.

## How?

Point at any element and press ⌘C (Mac) or Ctrl+C (Windows/Linux) to copy the file name, React component, and HTML source code. It makes tools like Cursor, Claude Code, and Copilot run up to 3× faster and more accurately. Try out a demo! →

## Install

Run this command at your project root (where next.config.ts or vite.config.ts is located):

```bash
npx -y grab@latest init
```

## Connect to MCP

```bash
npx -y grab@latest add mcp
```

## Usage

Once installed, hover over any UI element in your browser and press:

- ⌘C (Cmd+C) on Mac
- Ctrl+C on Windows/Linux

This copies the element's context (file name, React component, and HTML source code) to your clipboard, ready to paste into your coding agent. For example:

```
<a class="ml-auto inline-block text-sm" href="#">
  Forgot your password?
</a> in LoginForm at components/login-form.tsx:46:19
```

## Manual Installation

If you're using a React framework or build tool, see the instructions below.

### Next.js (App Router)

Add this inside your app/layout.tsx:

```tsx
import Script from "next/script";

export default function RootLayout({ children }) {
  return (
    <html>
      <head>
        {process.env.NODE_ENV === "development" && (
          <Script
            src="//unpkg.com/react-grab/dist/index.global.js"
            crossOrigin="anonymous"
            strategy="beforeInteractive"
          />
        )}
      </head>
      <body>{children}</body>
    </html>
  );
}
```

### Next.js (Pages Router)

Add this to your pages/_document.tsx:

```tsx
import { Html, Head, Main, NextScript } from "next/document";
import Script from "next/script";

export default function Document() {
  return (
    <Html lang="en">
      <Head>
        {process.env.NODE_ENV === "development" && (
          <Script
            src="//unpkg.com/react-grab/dist/index.global.js"
            crossOrigin="anonymous"
            strategy="beforeInteractive"
          />
        )}
      </Head>
      <body>
        <Main />
        <NextScript />
      </body>
    </Html>
  );
}
```

### Vite

Add this to your index.html:

```html
<!doctype html>
<html lang="en">
  <head>
    <script type="module">
      if (import.meta.env.DEV) {
        import("react-grab");
      }
    </script>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/src/main.tsx"></script>
  </body>
</html>
```

### Webpack

First, install React Grab:

```bash
npm install react-grab
```

Then add this at the top of your main entry file (e.g., src/index.tsx or src/main.tsx):

```ts
if (process.env.NODE_ENV === "development") {
  import("react-grab");
}
```

## Plugins

React Grab can be extended with plugins. A plugin can add context menu actions, toolbar menu items, lifecycle hooks, and theme overrides. Register a plugin via window.__REACT_GRAB__:

```ts
window.__REACT_GRAB__.registerPlugin({
  name: "my-plugin",
  hooks: {
    onElementSelect: (element) => {
      console.log("Selected:", element.tagName);
    },
  },
});
```

In React, register inside a useEffect after React Grab loads:

```tsx
useEffect(() => {
  const api = window.__REACT_GRAB__;
  if (!api) return;
  api.registerPlugin({
    name: "my-plugin",
    actions: [
      {
        id: "my-action",
        label: "My Action",
        shortcut: "M",
        onAction: (context) => {
          console.log("Action on:", context.element);
          context.hideContextMenu();
        },
      },
    ],
  });
  return () => api.unregisterPlugin("my-plugin");
}, []);
```

Actions use a target field to control where they appear. Omit target (or set "context-menu") for the right-click menu, or set "toolbar" for the toolbar dropdown:

```ts
actions: [
  {
    id: "inspect",
    label: "Inspect",
    shortcut: "I",
    onAction: (ctx) => console.dir(ctx.element),
  },
  {
    id: "toggle-freeze",
    label: "Freeze",
    target: "toolbar",
    isActive: () => isFrozen,
    onAction: () => toggleFreeze(),
  },
];
```

See packages/react-grab/src/types.ts for the full Plugin, PluginHooks, and PluginConfig interfaces.

## Primitives

React Grab provides a set of primitives for building your own mini React Grab.
Here's a simple example of how to build your own element selector with hover highlight and one-click inspection:

```bash
npm install react-grab@latest
```

Then, put this in your React app:

```tsx
import { useState } from "react";
import {
  getElementContext,
  freeze,
  unfreeze,
  openFile,
  type ReactGrabElementContext,
} from "react-grab/primitives";

const useElementSelector = (
  onSelect: (context: ReactGrabElementContext) => void
) => {
  const [isActive, setIsActive] = useState(false);

  const startSelecting = () => {
    setIsActive(true);

    const highlightOverlay = document.createElement("div");
    Object.assign(highlightOverlay.style, {
      position: "fixed",
      pointerEvents: "none",
      zIndex: "999999",
      border: "2px solid #3b82f6",
      transition: "all 75ms ease-out",
      display: "none",
    });
    document.body.appendChild(highlightOverlay);

    const handleMouseMove = ({ clientX, clientY }: MouseEvent) => {
      highlightOverlay.style.display = "none";
      const target = document.elementFromPoint(clientX, clientY);
      if (!target) return;
      const { top, left, width, height } = target.getBoundingClientRect();
      Object.assign(highlightOverlay.style, {
        top: `${top}px`,
        left: `${left}px`,
        width: `${width}px`,
        height: `${height}px`,
        display: "block",
      });
    };

    const handleClick = async ({ clientX, clientY }: MouseEvent) => {
      highlightOverlay.style.display = "none";
      const target = document.elementFromPoint(clientX, clientY);
      teardown();
      if (!target) return;
      freeze();
      onSelect(await getElementContext(target));
      unfreeze();
    };

    const teardown = () => {
      document.removeEventListener("mousemove", handleMouseMove);
      document.removeEventListener("click", handleClick, true);
      highlightOverlay.remove();
      setIsActive(false);
    };

    document.addEventListener("mousemove", handleMouseMove);
    document.addEventListener("click", handleClick, true);
  };

  return { isActive, startSelecting };
};

const ElementSelector = () => {
  const [context, setContext] = useState<ReactGrabElementContext | null>(null);
  const selector = useElementSelector(setContext);

  return (
    <div>
      <button onClick={selector.startSelecting} disabled={selector.isActive}>
        {selector.isActive ? "Selecting…" : "Select Element"}
      </button>
      {context && (
        <div>
          <p>Component: {context.componentName}</p>
          <p>Selector: {context.selector}</p>
          <pre>{context.stackString}</pre>
          <button
            onClick={() => {
              const frame = context.stack[0];
              if (frame?.fileName) openFile(frame.fileName, frame.lineNumber);
            }}
          >
            Open in Editor
          </button>
        </div>
      )}
    </div>
  );
};
```

See packages/react-grab/src/primitives.ts for the full ReactGrabElementContext, getElementContext, freeze, unfreeze, and openFile primitives.

## Resources & Contributing

Want to try it out? Check out our demo.

Looking to contribute back? Check out the Contributing Guide.

Want to talk to the community? Hop in our Discord and share your ideas and what you've built with React Grab.

Found a bug? Head over to our issue tracker and we'll do our best to help. We love pull requests, too!

We expect all contributors to abide by the terms of our Code of Conduct.

→ Start contributing on GitHub

## License

React Grab is MIT-licensed open-source software.

Thank you to Andrew Luetgers for donating the grab npm package name.

# AI Hedge Fund

An AI hedge fund team. This is a proof of concept for an AI-powered hedge fund. The goal of this project is to explore the use of AI to make trading decisions. This project is for educational purposes only and is not intended for real trading or investment.

This system employs several agents working together:

- Aswath Damodaran Agent - The Dean of Valuation, focuses on story, numbers, and disciplined valuation
- Ben Graham Agent - The godfather of value investing, only buys hidden gems with a margin of safety
- Bill Ackman Agent - An activist investor, takes bold positions and pushes for change
- Cathie Wood Agent - The queen of growth investing, believes in the power of innovation and disruption
- Charlie Munger Agent - Warren Buffett's partner, only buys wonderful businesses at fair prices
- Michael Burry Agent - The Big Short contrarian who hunts for deep value
- Mohnish Pabrai Agent - The Dhandho investor, who looks for doubles at low risk
- Peter Lynch Agent - Practical investor who seeks "ten-baggers" in everyday businesses
- Phil Fisher Agent - Meticulous growth investor who uses deep "scuttlebutt" research
- Rakesh Jhunjhunwala Agent - The Big Bull of India
- Stanley Druckenmiller Agent - Macro legend who hunts for asymmetric opportunities with growth potential
- Warren Buffett Agent - The Oracle of Omaha, seeks wonderful companies at a fair price
- Valuation Agent - Calculates the intrinsic value of a stock and generates trading signals
- Sentiment Agent - Analyzes market sentiment and generates trading signals
- Fundamentals Agent - Analyzes fundamental data and generates trading signals
- Technicals Agent - Analyzes technical indicators and generates trading signals
- Risk Manager - Calculates risk metrics and sets position limits
- Portfolio Manager - Makes final trading decisions and generates orders

Note: the system does not actually make any trades.

## Disclaimer

This project is for educational and research purposes only.
- Not intended for real trading or investment
- No investment advice or guarantees provided
- Creator assumes no liability for financial losses
- Consult a financial advisor for investment decisions
- Past performance does not indicate future results

By using this software, you agree to use it solely for learning purposes.

## Table of Contents

- How to Install
- How to Run
  - ⌨️ Command Line Interface
  - 🖥️ Web Application
- How to Contribute
- Feature Requests
- License

## How to Install

Before you can run the AI Hedge Fund, you'll need to install it and set up your API keys. These steps are common to both the full-stack web application and the command line interface.

### 1. Clone the repository

```bash
git clone https://github.com/virattt/ai-hedge-fund.git
cd ai-hedge-fund
```

### 2. Set up API keys

Create a .env file for your API keys:

```bash
# Create .env file for your API keys (in the root directory)
cp .env.example .env
```

Open and edit the .env file to add your API keys:

```bash
# For running LLMs hosted by OpenAI (gpt-4o, gpt-4o-mini, etc.)
OPENAI_API_KEY=your-openai-api-key

# For getting financial data to power the hedge fund
FINANCIAL_DATASETS_API_KEY=your-financial-datasets-api-key
```

Important: you must set at least one LLM API key (e.g. OPENAI_API_KEY, GROQ_API_KEY, ANTHROPIC_API_KEY, or DEEPSEEK_API_KEY) for the hedge fund to work.

Financial data: data for AAPL, GOOGL, MSFT, NVDA, and TSLA is free and does not require an API key. For any other ticker, you will need to set FINANCIAL_DATASETS_API_KEY in the .env file.

## How to Run

### ⌨️ Command Line Interface

You can run the AI Hedge Fund directly from the terminal. This approach offers more granular control and is useful for automation, scripting, and integration.

#### Quick Start

Install Poetry (if not already installed):

```bash
curl -sSL https://install.python-poetry.org | python3 -
```

Install dependencies:

```bash
poetry install
```

Run the AI Hedge Fund:

```bash
poetry run python src/main.py --ticker AAPL,MSFT,NVDA
```

You can also specify a --ollama flag to run the AI hedge fund using local LLMs.
```bash
poetry run python src/main.py --ticker AAPL,MSFT,NVDA --ollama
```

You can optionally specify start and end dates to make decisions over a specific time period:

```bash
poetry run python src/main.py --ticker AAPL,MSFT,NVDA --start-date 2024-01-01 --end-date 2024-03-01
```

#### Run the Backtester

```bash
poetry run python src/backtester.py --ticker AAPL,MSFT,NVDA
```

Note: the --ollama, --start-date, and --end-date flags work for the backtester as well!

### 🖥️ Web Application

The new way to run the AI Hedge Fund is through our web application, which provides a user-friendly interface. This is recommended for users who prefer visual interfaces over command line tools. Please see the detailed instructions on how to install and run the web application here.

## How to Contribute

1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request

Important: please keep your pull requests small and focused. This will make them easier to review and merge.

## Feature Requests

If you have a feature request, please open an issue and make sure it is tagged with enhancement.

## License

This project is licensed under the MIT License - see the LICENSE file for details.
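The multi-agent pipeline described at the top of this section — analyst agents emitting signals, a risk manager capping position sizes, and a portfolio manager making the final call — can be sketched as a toy in Python. Everything here (the names, the confidence-weighted voting rule, the thresholds) is an illustrative assumption, not the repository's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Signal:
    agent: str
    action: str        # "bullish", "bearish", or "neutral"
    confidence: float  # 0.0 - 1.0

def portfolio_decision(signals: list[Signal], max_position: int = 100) -> tuple[str, int]:
    """Confidence-weighted vote over analyst signals, capped by a risk limit."""
    score = sum(
        s.confidence * {"bullish": 1, "bearish": -1, "neutral": 0}[s.action]
        for s in signals
    )
    n = len(signals) or 1
    if score / n > 0.2:
        action = "buy"
    elif score / n < -0.2:
        action = "sell"
    else:
        action = "hold"
    # Risk-manager step: scale position size by strength of agreement, capped.
    quantity = 0 if action == "hold" else min(max_position, int(abs(score / n) * max_position))
    return action, quantity

signals = [
    Signal("buffett", "bullish", 0.8),
    Signal("burry", "bearish", 0.4),
    Signal("wood", "bullish", 0.6),
]
print(portfolio_decision(signals))  # ('buy', 33)
```

The real system presumably weighs far richer state (valuations, risk metrics, portfolio context), but the shape — many opinions reduced to one bounded order — is the same.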

# Skills Catalog for Codex

## Agent Skills

Agent Skills are folders of instructions, scripts, and resources that AI agents can discover and use to perform specific tasks. Write once, use everywhere. Codex uses skills to package capabilities that teams and individuals can reuse to complete specific tasks in a repeatable way. This repository catalogs skills for use and distribution with Codex.

Learn more:

- Using skills in Codex
- Create custom skills in Codex
- Agent Skills open standard

## Installing a skill

Skills in .system are automatically installed in the latest version of Codex. To install curated or experimental skills, use the $skill-installer inside Codex.

Curated skills can be installed by name (defaults to skills/.curated):

```
$skill-installer gh-address-comments
```

For experimental skills, specify the skill folder. For example:

```
$skill-installer install the create-plan skill from the .experimental folder
```

Or provide the GitHub directory URL:

```
$skill-installer install https://github.com/openai/skills/tree/main/skills/.experimental/create-plan
```

After installing a skill, restart Codex to pick up new skills.

## License

The license of an individual skill can be found in the LICENSE.txt file inside the skill's directory.

Hugging Face Papers

Port congestion at major maritime hubs disrupts global supply chains, yet existing prediction systems typically prioritize forecasting accuracy without providing operationally interpretable explanations. This paper proposes AIS-TGNN, an evidence-grounded framework that jointly performs congestion-escalation prediction and faithful natural-language explanation by coupling a Temporal Graph Attention Network (TGAT) with a structured large language model (LLM) reasoning module. Daily spatial graphs are constructed from Automatic Identification System (AIS) broadcasts, where each grid cell represents localized vessel activity and inter-cell interactions are modeled through attention-based message passing. The TGAT predictor captures spatiotemporal congestion dynamics, while model-internal evidence, including feature z-scores and attention-derived neighbor influence, is transformed into structured prompts that constrain LLM reasoning to verifiable model outputs. To evaluate explanatory reliability, we introduce a directional-consistency validation protocol that quantitatively measures agreement between generated narratives and the underlying statistical evidence. Experiments on six months of AIS data from the Ports of Los Angeles and Long Beach demonstrate that the proposed framework outperforms both LR and GCN baselines, achieving a test AUC of 0.761, AP of 0.344, and recall of 0.504 under a strict chronological split, while producing explanations with 99.6% directional consistency. Results show that grounding LLM generation in graph-model evidence enables interpretable and auditable risk reporting without sacrificing predictive performance. The framework provides a practical pathway toward operationally deployable explainable AI for maritime congestion monitoring and supply-chain risk management.
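The directional-consistency idea — checking that each narrative claim about a feature ("elevated", "reduced") agrees with the sign of that feature's z-score — can be illustrated with a toy check. This is not the authors' code; the claim format, feature names, and scoring rule below are all assumptions made for illustration:

```python
def directional_consistency(claims: dict[str, str], z_scores: dict[str, float]) -> float:
    """Fraction of narrative claims whose stated direction matches the sign of
    the corresponding feature z-score ("up" expects z > 0, "down" expects z < 0)."""
    if not claims:
        return 1.0
    matches = 0
    for feature, direction in claims.items():
        z = z_scores.get(feature, 0.0)
        if direction == "up" and z > 0:
            matches += 1
        elif direction == "down" and z < 0:
            matches += 1
    return matches / len(claims)

# A narrative claiming "more vessels, slower speeds, longer dwell times",
# checked against the model's feature z-scores:
claims = {"vessel_count": "up", "avg_speed": "down", "dwell_time": "up"}
z = {"vessel_count": 2.1, "avg_speed": -1.4, "dwell_time": -0.2}
print(directional_consistency(claims, z))  # 2 of 3 claims match the evidence
```

The paper's protocol presumably extracts claims from free text automatically; the sketch only shows the agreement metric once claims and evidence are in hand.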

This paper addresses the problem of Human Activity Recognition (HAR) using data from wearable inertial sensors. An important challenge in HAR is the model's generalization capabilities to new unseen individuals due to inter-subject variability, i.e., the same activity is performed differently by different individuals. To address this problem, we propose a novel deep adversarial framework that integrates the concept of inter-subject variability in the adversarial task, thereby encouraging subject-invariant feature representations and enhancing the classification performance in the HAR problem. Our approach outperforms previous methods in three well-established HAR datasets using a leave-one-subject-out (LOSO) cross-validation. Further results indicate that our proposed adversarial task effectively reduces inter-subject variability among different users in the feature space, and it outperforms adversarial tasks from previous works when integrated into our framework. Code: https://github.com/FranciscoCalatrava/EmbeddedSubjectVariability.git
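The evaluation protocol mentioned above, leave-one-subject-out (LOSO) cross-validation, holds out every sample from one subject per fold so the model is always tested on an unseen person. A minimal generic sketch of generating such splits (not the authors' code; subject IDs and counts are made up):

```python
def loso_splits(subject_ids: list[str]):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Six sensor windows recorded from three subjects:
labels = ["s1", "s1", "s2", "s2", "s3", "s3"]
for held_out, train, test in loso_splits(labels):
    print(held_out, train, test)
# s1 [2, 3, 4, 5] [0, 1]
# s2 [0, 1, 4, 5] [2, 3]
# s3 [0, 1, 2, 3] [4, 5]
```

Because no data from the test subject leaks into training, LOSO directly measures the cross-subject generalization that inter-subject variability makes hard.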

Deep neural networks have achieved strong performance in image classification tasks due to their ability to learn complex patterns from high-dimensional data. However, their large computational and memory requirements often limit deployment on resource-constrained platforms such as remote sensing devices and edge systems. Network compression techniques have therefore been proposed to reduce model size and computational cost while maintaining predictive performance. In this study, we conduct a systematic evaluation of neural network compression methods for a remote sensing application, namely hyperspectral land cover classification. Specifically, we examine three widely used compression strategies for convolutional neural networks: pruning, quantization, and knowledge distillation. Experiments are conducted on two benchmark hyperspectral datasets, considering classification accuracy, memory consumption, and inference efficiency. Our results demonstrate that compressed models can significantly reduce model size and computational cost while maintaining competitive classification performance. These findings provide insights into the trade-offs between compression ratio, efficiency, and accuracy, and highlight the potential of compression techniques for enabling efficient deep learning deployment in remote sensing applications.
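Of the three compression strategies examined, magnitude pruning is the simplest to illustrate: the weights smallest in absolute value are zeroed out at a chosen sparsity level. A dependency-free toy sketch (an illustration of the general technique, not the study's implementation):

```python
def magnitude_prune(weights: list[float], sparsity: float) -> list[float]:
    """Zero out the fraction `sparsity` of weights with the smallest |value|.

    Ties at the threshold may zero slightly more than the requested fraction.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.5, -0.05, 0.3, 0.01, -0.8, 0.02]
print(magnitude_prune(w, 0.5))  # [0.5, 0.0, 0.3, 0.0, -0.8, 0.0]
```

Real pruning pipelines apply this per layer (or globally) to tensors and usually fine-tune afterwards to recover accuracy; the trade-off the paper measures is exactly how much sparsity the task tolerates.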

Partial label learning (PLL) is a prominent weakly supervised classification task, where each training instance is ambiguously labeled with a set of candidate labels. In real-world scenarios, candidate labels are often influenced by instance features, leading to the emergence of instance-dependent PLL (ID-PLL), a setting that more accurately reflects this relationship. A significant challenge in ID-PLL is instance entanglement, where instances from similar classes share overlapping features and candidate labels, resulting in increased class confusion. To address this issue, we propose a novel Class-specific Augmentation based Disentanglement (CAD) framework, which tackles instance entanglement through both intra- and inter-class regulation. For intra-class regulation, CAD amplifies class-specific features to generate class-wise augmentations and aligns same-class augmentations across instances. For inter-class regulation, CAD introduces a weighted penalty loss function that applies stronger penalties to more ambiguous labels, encouraging larger inter-class distances. By jointly applying intra- and inter-class regulation, CAD sharpens class boundaries and reduces the class confusion caused by entanglement. Extensive experimental results demonstrate the effectiveness of CAD in mitigating the entanglement problem and enhancing ID-PLL performance. The code is available at https://github.com/RyanZhaoIc/CAD.git.

Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators blindly collect demonstrations without knowing the underlying policy's weaknesses, leading to inefficient coverage of critical state distributions. Conversely, interactive methods like DAgger effectively address covariate shift but rely on physical robot execution, which is costly and difficult to scale. To reconcile this trade-off, we introduce RoboPocket, a portable system that enables Robot-Free Instant Policy Iteration using single consumer smartphones. Its core innovation is a Remote Inference framework that visualizes the policy's predicted trajectory via Augmented Reality (AR) Visual Foresight. This immersive feedback allows collectors to proactively identify potential failures and focus data collection on the policy's weak regions without requiring a physical robot. Furthermore, we implement an asynchronous Online Finetuning pipeline that continuously updates the policy with incoming data, effectively closing the learning loop in minutes. Extensive experiments demonstrate that RoboPocket adheres to data scaling laws and doubles the data efficiency compared to offline scaling strategies, overcoming their long-standing efficiency bottleneck. Moreover, our instant iteration loop also boosts sample efficiency by up to 2× in distributed environments with a small number of interactive corrections per person. Project page and videos: https://robo-pocket.github.io.

YouTube

No videos available


Weather

Shanghai, Shanghai — Partly Cloudy, feels like 6°

| Time | Temp |
| --- | --- |
| 2am | 5° |
| 4am | 5° |
| 6am | 5° |
| 8am | 10° |
| 10am | 6° |
| 12pm | 8° |
| 2pm | 9° |
| 4pm | 10° |
| 6pm | 8° |
| 8pm | 6° |
| 10pm | 5° |
| 12am | 5° |