DeepSignal-4B-V1 (GGUF)

This repository provides a GGUF model file for local inference (e.g., llama.cpp or LM Studio). The model is intended for traffic-signal-control analysis and related text-generation workflows. For details, see our repository at AIMSLaboratory/DeepSignal.

Files

  • DeepSignal-4B_V1.F16.gguf
  • config.json

Quickstart (llama.cpp)

```
llama-cli -m DeepSignal-4B_V1.F16.gguf -p "You are a traffic management expert. You can use your traffic knowledge to solve the traffic signal control task.
Based on the given traffic {scene} and {state}, predict the next signal phase and its duration.
You must answer directly, the format must be: next signal phase: {number}, duration: {seconds} seconds
where the number is the phase index (starting from 0) and the seconds is the duration (usually between 20-90 seconds)."
```

You need to fill in {scene} (the total number of phases, which phase controls which lanes/directions, the current phase ID/number, etc.) and {state} (the number of queueing vehicles per lane, the throughput vehicles per lane during the current phase, etc.). A hypothetical filled-in example is shown below.
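For scripted use, the same prompt can be assembled and the model's reply parsed programmatically. Below is a minimal sketch using llama-cpp-python; the scene/state values, sampling parameters, and parsing pattern are illustrative assumptions, not part of the official pipeline.

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# The scene/state values below are made up for illustration.
import re

from llama_cpp import Llama

llm = Llama(model_path="DeepSignal-4B_V1.F16.gguf", n_ctx=4096, verbose=False)

# Hypothetical {scene}: phase layout and the currently active phase.
scene = ("4 phases; phase 0: N-S through, phase 1: N-S left, "
         "phase 2: E-W through, phase 3: E-W left; current phase: 0")
# Hypothetical {state}: per-lane queue and throughput counts.
state = ("queueing vehicles per lane: N 6, S 4, E 9, W 7; "
         "throughput vehicles per lane during current phase: N 3, S 2, E 0, W 0")

prompt = (
    "You are a traffic management expert. You can use your traffic knowledge "
    "to solve the traffic signal control task.\n"
    f"Based on the given traffic {scene} and {state}, predict the next signal "
    "phase and its duration.\n"
    "You must answer directly, the format must be: "
    "next signal phase: {number}, duration: {seconds} seconds\n"
    "where the number is the phase index (starting from 0) and the seconds is "
    "the duration (usually between 20-90 seconds)."
)

out = llm(prompt, max_tokens=64, temperature=0.0)
text = out["choices"][0]["text"]

# Expected reply format: "next signal phase: 2, duration: 35 seconds"
m = re.search(r"next signal phase:\s*(\d+),\s*duration:\s*(\d+)\s*seconds", text)
if m:
    phase, duration = int(m.group(1)), int(m.group(2))
    print(f"phase={phase}, duration={duration} s")
```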

Evaluation (Traffic Simulation)

Performance Metrics Comparison by Model *

| Model | Avg Saturation | Avg Cumulative Queue Length (veh⋅min) | Avg Throughput (veh/5 min) | Avg Response Time (s) |
|---|---|---|---|---|
| GPT-OSS-20B (thinking) | 0.380 | 14.088 | 77.910 | 6.768 |
| DeepSignal-4B (Ours) | 0.422 | 15.703 | 79.883 | 2.131 |
| Qwen3-30B-A3B | 0.431 | 17.046 | 79.059 | 2.727 |
| Qwen3-4B | 0.466 | 57.699 | 75.712 | 1.994 |
| Max Pressure | 0.465 | 23.022 | 77.236 | ** |
| LightGPT-8B-Llama3 | 0.523 | 54.384 | 75.512 | 3.025 *** |

*: Each simulation scenario runs for 60 minutes. We discard the first 5 minutes as warm-up, then compute metrics over the next 20 minutes (minute 5 to 25). We cap the evaluation window because, when an LLM controls signal timing for only a single intersection, spillback from neighboring intersections may occur after ~20+ minutes and destabilize the scenario. All evaluations are conducted on a Mac Studio M3 Ultra.
**: Max Pressure is a fixed signal-timing optimization algorithm (not an LLM), so we omit its Avg Response Time; this metric is only defined for LLM-based signal-timing optimization. (A minimal sketch of the Max Pressure rule follows these notes.)
***: For LightGPT-8B-Llama3, Avg Response Time is computed using only the successful responses.
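
For context on the Max Pressure baseline in note **: it activates the phase whose permitted movements carry the largest total pressure (upstream queue minus downstream queue). The sketch below illustrates the rule under assumed data structures; the lane names and queue counts are made up, and this is not the exact baseline implementation used in our evaluation.

```python
# Illustrative sketch of the Max Pressure rule.
# pressure(phase) = sum over movements served by that phase of
#                   (queue on incoming lane - queue on outgoing lane);
# the controller activates the phase with the highest pressure.

def max_pressure_phase(phases, queue):
    """phases: list of movement lists [(in_lane, out_lane), ...];
    queue: dict mapping lane id -> number of queued vehicles."""
    def pressure(movements):
        return sum(queue[i] - queue[o] for i, o in movements)
    return max(range(len(phases)), key=lambda p: pressure(phases[p]))

# Hypothetical two-phase intersection:
phases = [[("N_in", "S_out"), ("S_in", "N_out")],   # phase 0: N-S through
          [("E_in", "W_out"), ("W_in", "E_out")]]   # phase 1: E-W through
queue = {"N_in": 6, "S_in": 4, "E_in": 9, "W_in": 7,
         "N_out": 1, "S_out": 0, "E_out": 2, "W_out": 1}
print(max_pressure_phase(phases, queue))  # -> 1 (E-W pressure is higher)
```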

Model Details

  • Format: GGUF (F16, 16-bit)
  • Model size: 4B params
  • Architecture: qwen3
