29 KiB
29 KiB
因果分析日志 (v2)
日志说明
本文档记录 LLM 进行因果分析时的所有输入参数和输出结果。
分析记录
分析 #001
时间: 2026-03-29T19:16:18.924521
系统提示词
你是一位专业的因果推断分析师。你的任务是分析给定的数据,识别因果变量,并对每个变量进行时间层级解析。
请以 JSON 格式输出分析结果,不要包含任何额外的解释或思考过程。
JSON 输出规范:
{
"treatment": "处理变量名称",
"outcome": "结果变量名称",
"time_tiers": {
"变量名1": 整数层级,
"变量名2": 整数层级,
...
}
}
time_tiers 层级说明(整数,越小表示越早发生):
- -1: 非时间变量(如样本唯一标识符 id)
- 0: 人口学特征或不变的混杂因素(如 age, gender, race)
- 1: 基线测量(干预前测得,可能是混杂因素,如 base_health)
- 2: 干预点/处理变量(如 treatment)
- 3: 中介变量(干预后、结果前测得)
- 4: 随访结果/结果变量(如 health)
- 5+: 更晚的时间点(如有多次随访)
注意:
- 只输出上述 JSON 格式,不要包含其他字段
- 处理变量和结果变量名称必须与数据表格的列名完全一致
- time_tiers 必须包含数据中的所有列名
- 不要使用 markdown 代码块标记(如 ```json)
- 直接输出纯 JSON 字符串
用户提示词
请分析以下医疗数据,并严格按照 JSON 格式输出分析结果:
**数据列说明:**
- `id`: 样本唯一标识符
- `treatment`: 是否吃药(0=未吃药,1=吃药)
- `health`: 病人健康状态(0~1 浮点数,越高越好)
- `base_health`: 基准健康状态(未吃药时的健康状态)
- `age`: 年龄(18~70 岁)
**数据概览:**
- 样本数量:500
- 变量:id, treatment, health, base_health, age
**统计摘要:**
id treatment health base_health age
count 500.000000 500.000000 500.000000 500.000000 500.000000
mean 250.500000 0.656000 0.588928 0.414174 44.732000
std 144.481833 0.475517 0.211767 0.175689 15.239707
min 1.000000 0.000000 0.022200 0.012700 18.000000
25% 125.750000 0.000000 0.436300 0.282500 32.000000
50% 250.500000 1.000000 0.585800 0.407950 45.000000
75% 375.250000 1.000000 0.741575 0.534325 57.000000
max 500.000000 1.000000 1.000000 0.902500 70.000000
**处理变量分布:**
treatment
1 328
0 172
JSON 输出格式要求:
{
"treatment": "处理变量名称",
"outcome": "结果变量名称",
"time_tiers": {
"id": -1,
"treatment": 2,
"health": 4,
"base_health": 1,
"age": 0
}
}
要求:
1. 处理变量和结果变量名称必须与表格列名完全一致
2. time_tiers 必须覆盖所有列名
3. 只输出 JSON,不要包含其他任何内容
4. 不要使用 markdown 代码块标记
LLM 输出
{
"treatment": "treatment",
"outcome": "health",
"time_tiers": {
"id": -1,
"treatment": 2,
"health": 4,
"base_health": 1,
"age": 0
}
}
分析报告
{
"query_interpretation": {
"treatment": "treatment",
"outcome": "health",
"estimand": "ATE"
},
"causal_graph": {
"nodes": [
"treatment",
"health",
"base_health",
"age"
],
"edges": [
{
"from": "treatment",
"to": "health",
"type": "hypothesized"
},
{
"from": "base_health",
"to": "treatment",
"type": "confounding"
},
{
"from": "base_health",
"to": "health",
"type": "confounding"
},
{
"from": "age",
"to": "treatment",
"type": "confounding"
},
{
"from": "age",
"to": "health",
"type": "confounding"
}
],
"backdoor_paths": [
"treatment <- base_health -> health",
"treatment <- age -> health"
]
},
"identification": {
"strategy": "Backdoor Adjustment",
"adjustment_set": [
"age",
"base_health"
],
"reasoning": "发现 2 条后门路径。通过控制变量 ['age', 'base_health'] 可阻断所有后门路径,满足后门准则。"
},
"estimation": {
"ATE_Outcome_Regression": 0.2444,
"ATE_IPW": 0.2425,
"95%_CI": [
0.2347,
0.2539
],
"interpretation": "在控制 ['age', 'base_health'] 后,接受治疗使患者随访时的 health 平均变化 0.2435 分(95%CI: 0.23-0.25)。"
},
"diagnostics": {
"balance_check": {
"age": {
"before": 0.4104,
"after": 0.0087
},
"base_health": {
"before": -0.3377,
"after": -0.0587
}
},
"overlap_assumption": "满足",
"robustness": "稳健"
},
"warnings": [
{
"type": "unobserved_confounding",
"message": "可能存在未观测混杂(如患者依从性、社会经济地位),建议进行敏感性分析。"
}
]
}
调用参数
{
"data_path": "examples/medical_v2/data.xlsx",
"sample_size": 500,
"variables": [
"id",
"treatment",
"health",
"base_health",
"age"
],
"treatment_variable": "treatment",
"outcome_variable": "health",
"time_tiers": {
"id": -1,
"treatment": 2,
"health": 4,
"base_health": 1,
"age": 0
},
"llm_params": {
"base_url": "http://10.106.123.247:8000/v1",
"model": "qwen3.5-35b",
"temperature": 0.3,
"max_tokens": 2048
},
"candidates": [
{
"var": "base_health",
"pearson_T": -0.1612,
"pearson_Y": 0.7356,
"spearman_T": -0.1505,
"spearman_Y": 0.7267,
"pvalue_T": 0.0007,
"pvalue_Y": 0.0,
"mi_T": 0.0225,
"mi_Y": 1.413
},
{
"var": "age",
"pearson_T": 0.1913,
"pearson_Y": 0.2968,
"spearman_T": 0.1893,
"spearman_Y": 0.3077,
"pvalue_T": 0.0,
"pvalue_Y": 0.0,
"mi_T": 0.0152,
"mi_Y": 0.074
}
],
"causal_graph": {
"nodes": [
"treatment",
"health",
"base_health",
"age"
],
"edges": [
{
"from": "treatment",
"to": "health",
"type": "hypothesized"
},
{
"from": "base_health",
"to": "treatment",
"type": "confounding"
},
{
"from": "base_health",
"to": "health",
"type": "confounding"
},
{
"from": "age",
"to": "treatment",
"type": "confounding"
},
{
"from": "age",
"to": "health",
"type": "confounding"
}
],
"backdoor_paths": [
"treatment <- base_health -> health",
"treatment <- age -> health"
]
},
"identification": {
"strategy": "Backdoor Adjustment",
"adjustment_set": [
"age",
"base_health"
],
"reasoning": "发现 2 条后门路径。通过控制变量 ['age', 'base_health'] 可阻断所有后门路径,满足后门准则。"
},
"estimation": {
"ATE_Outcome_Regression": 0.2444,
"ATE_IPW": 0.2425,
"ATE_reported": 0.2435,
"95%_CI": [
0.2347,
0.2539
],
"interpretation": "在控制 ['age', 'base_health'] 后,接受治疗使患者随访时的 health 平均变化 0.2435 分(95%CI: 0.23-0.25)。",
"overlap_assumption": "满足",
"robustness": "稳健"
},
"log_path": "examples/medical_v2/log.md"
}
分析 #002
时间: 2026-03-29T20:00:23.373499
系统提示词
你是一位专业的因果推断分析师。你的任务是分析给定的数据,识别处理变量(treatment)、结果变量(outcome),并对每个变量进行时间层级解析。
请以 JSON 格式输出分析结果,不要包含任何额外的解释或思考过程。
JSON 输出规范:
{
"treatment": "处理变量名称",
"outcome": "结果变量名称",
"time_tiers": {
"变量名1": 整数层级,
"变量名2": 整数层级,
...
}
}
time_tiers 层级说明(整数,越小表示越早发生):
- -1: 非时间变量(如样本唯一标识符 id、index 等)
- 0: 人口学特征或不变的混杂因素(如 age、gender、region 等)
- 1: 基线测量(干预前测得,可能是混杂因素,如 baseline_score、pre_test 等)
- 2: 干预点/处理变量(如 treatment、intervention、policy 等)
- 3: 中介变量(干预后、结果前测得)
- 4: 随访结果/结果变量(如 outcome、post_test、score 等)
- 5+: 更晚的时间点(如有多次随访)
注意:
- 只输出上述 JSON 格式,不要包含其他字段
- treatment 和 outcome 必须是数据表格中真实存在的列名
- time_tiers 必须包含数据中的所有列名
- 不要使用 markdown 代码块标记(如 ```json)
- 直接输出纯 JSON 字符串
用户提示词
请分析以下数据,并严格按照 JSON 格式输出分析结果:
**数据概览:**
- 样本数量:500
- 变量:id, treatment, health, base_health, age
**统计摘要:**
id treatment health base_health age
count 500.000000 500.000000 500.000000 500.000000 500.000000
mean 250.500000 0.656000 0.588928 0.414174 44.732000
std 144.481833 0.475517 0.211767 0.175689 15.239707
min 1.000000 0.000000 0.022200 0.012700 18.000000
25% 125.750000 0.000000 0.436300 0.282500 32.000000
50% 250.500000 1.000000 0.585800 0.407950 45.000000
75% 375.250000 1.000000 0.741575 0.534325 57.000000
max 500.000000 1.000000 1.000000 0.902500 70.000000
**处理变量分布:**
treatment
1 328
0 172
JSON 输出格式要求:
{
"treatment": "处理变量名称",
"outcome": "结果变量名称",
"time_tiers": {
"列名1": 层级整数,
"列名2": 层级整数,
...
}
}
要求:
1. treatment 和 outcome 必须与表格列名完全一致
2. time_tiers 必须覆盖所有列名
3. 根据列名含义和统计摘要推断每个变量的时间层级
4. 只输出 JSON,不要包含其他任何内容
5. 不要使用 markdown 代码块标记
LLM 输出
{
"treatment": "treatment",
"outcome": "health",
"time_tiers": {
"id": -1,
"age": 0,
"base_health": 1,
"treatment": 2,
"health": 4
}
}
分析报告
{
"query_interpretation": {
"treatment": "treatment",
"outcome": "health",
"estimand": "ATE"
},
"causal_graph": {
"nodes": [
"treatment",
"health",
"base_health",
"age"
],
"edges": [
{
"from": "treatment",
"to": "health",
"type": "hypothesized"
},
{
"from": "base_health",
"to": "treatment",
"type": "confounding"
},
{
"from": "base_health",
"to": "health",
"type": "confounding"
},
{
"from": "age",
"to": "treatment",
"type": "confounding"
},
{
"from": "age",
"to": "health",
"type": "confounding"
}
],
"backdoor_paths": [
"treatment <- base_health -> health",
"treatment <- age -> health"
]
},
"identification": {
"strategy": "Backdoor Adjustment",
"adjustment_set": [
"age",
"base_health"
],
"reasoning": "发现 2 条后门路径。通过控制变量 ['age', 'base_health'] 可阻断所有后门路径,满足后门准则。"
},
"estimation": {
"ATE_Outcome_Regression": 0.2444,
"ATE_IPW": 0.2425,
"95%_CI": [
0.2349,
0.2535
],
"interpretation": "在控制 ['age', 'base_health'] 后,接受治疗使患者随访时的 health 平均变化 0.2435 分(95%CI: 0.23-0.25)。"
},
"diagnostics": {
"balance_check": {
"age": {
"before": 0.4104,
"after": 0.0087
},
"base_health": {
"before": -0.3377,
"after": -0.0587
}
},
"overlap_assumption": "满足",
"robustness": "稳健"
},
"warnings": [
{
"type": "unobserved_confounding",
"message": "可能存在未观测混杂(如患者依从性、社会经济地位),建议进行敏感性分析。"
}
]
}
调用参数
{
"data_path": "examples/medical_v2/data.xlsx",
"sample_size": 500,
"variables": [
"id",
"treatment",
"health",
"base_health",
"age"
],
"treatment_variable": "treatment",
"outcome_variable": "health",
"time_tiers": {
"id": -1,
"age": 0,
"base_health": 1,
"treatment": 2,
"health": 4
},
"llm_params": {
"base_url": "http://10.106.123.247:8000/v1",
"model": "qwen3.5-35b",
"temperature": 0.3,
"max_tokens": 2048
},
"candidates": [
{
"var": "base_health",
"pearson_T": -0.1612,
"pearson_Y": 0.7356,
"spearman_T": -0.1505,
"spearman_Y": 0.7267,
"pvalue_T": 0.0007,
"pvalue_Y": 0.0,
"mi_T": 0.0225,
"mi_Y": 1.413
},
{
"var": "age",
"pearson_T": 0.1913,
"pearson_Y": 0.2968,
"spearman_T": 0.1893,
"spearman_Y": 0.3077,
"pvalue_T": 0.0,
"pvalue_Y": 0.0,
"mi_T": 0.0152,
"mi_Y": 0.074
}
],
"causal_graph": {
"nodes": [
"treatment",
"health",
"base_health",
"age"
],
"edges": [
{
"from": "treatment",
"to": "health",
"type": "hypothesized"
},
{
"from": "base_health",
"to": "treatment",
"type": "confounding"
},
{
"from": "base_health",
"to": "health",
"type": "confounding"
},
{
"from": "age",
"to": "treatment",
"type": "confounding"
},
{
"from": "age",
"to": "health",
"type": "confounding"
}
],
"backdoor_paths": [
"treatment <- base_health -> health",
"treatment <- age -> health"
]
},
"identification": {
"strategy": "Backdoor Adjustment",
"adjustment_set": [
"age",
"base_health"
],
"reasoning": "发现 2 条后门路径。通过控制变量 ['age', 'base_health'] 可阻断所有后门路径,满足后门准则。"
},
"estimation": {
"ATE_Outcome_Regression": 0.2444,
"ATE_IPW": 0.2425,
"ATE_reported": 0.2435,
"95%_CI": [
0.2349,
0.2535
],
"interpretation": "在控制 ['age', 'base_health'] 后,接受治疗使患者随访时的 health 平均变化 0.2435 分(95%CI: 0.23-0.25)。",
"overlap_assumption": "满足",
"robustness": "稳健"
},
"log_path": "examples/medical_v2/log.md"
}
分析 #003
时间: 2026-03-29T22:01:34.463873
系统提示词
你是一位专业的因果推断分析师。你的任务是分析给定的数据,识别处理变量(treatment)、结果变量(outcome),并对每个变量进行时间层级解析。
请以 JSON 格式输出分析结果,不要包含任何额外的解释或思考过程。
JSON 输出规范:
{
"treatment": "处理变量名称",
"outcome": "结果变量名称",
"time_tiers": {
"变量名1": 整数层级,
"变量名2": 整数层级,
...
}
}
time_tiers 层级说明(整数,越小表示越早发生):
- -1: 非时间变量(如样本唯一标识符 id、index 等)
- 0: 人口学特征或不变的混杂因素(如 age、gender、region 等)
- 1: 基线测量(干预前测得,可能是混杂因素,如 baseline_score、pre_test 等)
- 2: 干预点/处理变量(如 treatment、intervention、policy 等)
- 3: 中介变量(干预后、结果前测得)
- 4: 随访结果/结果变量(如 outcome、post_test、score 等)
- 5+: 更晚的时间点(如有多次随访)
注意:
- 只输出上述 JSON 格式,不要包含其他字段
- treatment 和 outcome 必须是数据表格中真实存在的列名
- time_tiers 必须包含数据中的所有列名
- 不要使用 markdown 代码块标记(如 ```json)
- 直接输出纯 JSON 字符串
用户提示词
请分析以下数据,并严格按照 JSON 格式输出分析结果:
**数据概览:**
- 样本数量:500
- 变量:id, treatment, health, base_health, age
**统计摘要:**
id treatment health base_health age
count 500.000000 500.000000 500.000000 500.000000 500.000000
mean 250.500000 0.656000 0.588928 0.414174 44.732000
std 144.481833 0.475517 0.211767 0.175689 15.239707
min 1.000000 0.000000 0.022200 0.012700 18.000000
25% 125.750000 0.000000 0.436300 0.282500 32.000000
50% 250.500000 1.000000 0.585800 0.407950 45.000000
75% 375.250000 1.000000 0.741575 0.534325 57.000000
max 500.000000 1.000000 1.000000 0.902500 70.000000
JSON 输出格式要求:
{
"treatment": "处理变量名称",
"outcome": "结果变量名称",
"time_tiers": {
"列名1": 层级整数,
"列名2": 层级整数,
...
}
}
要求:
1. treatment 和 outcome 必须与表格列名完全一致
2. time_tiers 必须覆盖所有列名
3. 根据列名含义和统计摘要推断每个变量的时间层级
4. 只输出 JSON,不要包含其他任何内容
5. 不要使用 markdown 代码块标记
LLM 输出
{'treatment': 'treatment', 'outcome': 'health', 'time_tiers': {'id': -1, 'treatment': 2, 'health': 4, 'base_health': 1, 'age': 0}}
分析报告
{
"query_interpretation": {
"treatment": "treatment",
"outcome": "health",
"estimand": "ATE"
},
"causal_graph": {
"nodes": [
"treatment",
"health",
"base_health",
"age"
],
"edges": [
{
"from": "treatment",
"to": "health",
"type": "hypothesized"
},
{
"from": "base_health",
"to": "treatment",
"type": "confounding"
},
{
"from": "base_health",
"to": "health",
"type": "confounding"
},
{
"from": "age",
"to": "treatment",
"type": "confounding"
},
{
"from": "age",
"to": "health",
"type": "confounding"
}
],
"backdoor_paths": [
"treatment <- base_health -> health",
"treatment <- age -> health"
]
},
"identification": {
"strategy": "Backdoor Adjustment",
"adjustment_set": [
"age",
"base_health"
],
"reasoning": "发现 2 条后门路径。通过控制变量 ['age', 'base_health'] 可阻断所有后门路径,满足后门准则。"
},
"estimation": {
"ATE_Outcome_Regression": 0.2444,
"ATE_IPW": 0.2425,
"95%_CI": [
0.2351,
0.2544
],
"interpretation": "在控制 ['age', 'base_health'] 后,接受处理使 health 平均变化 0.2435 (95%CI: 0.24-0.25)。"
},
"diagnostics": {
"balance_check": {
"age": {
"before": 0.4104,
"after": 0.0087
},
"base_health": {
"before": -0.3377,
"after": -0.0587
}
},
"overlap_assumption": "满足",
"robustness": "稳健"
},
"warnings": [
{
"type": "unobserved_confounding",
"message": "可能存在未观测混杂,建议进行敏感性分析。"
}
]
}
调用参数
{
"data_path": "examples/medical_v2/data.xlsx",
"sample_size": 500,
"variables": [
"id",
"treatment",
"health",
"base_health",
"age"
],
"treatment_variable": "treatment",
"outcome_variable": "health",
"time_tiers": {
"id": -1,
"treatment": 2,
"health": 4,
"base_health": 1,
"age": 0
},
"llm_params": {
"base_url": "http://10.106.123.247:8000/v1",
"model": "qwen3.5-35b",
"temperature": 0.3,
"max_tokens": 2048
},
"candidates": [
{
"var": "base_health",
"pearson_T": -0.1612,
"pearson_Y": 0.7356,
"spearman_T": -0.1505,
"spearman_Y": 0.7267,
"pvalue_T": 0.0007,
"pvalue_Y": 0.0,
"mi_T": 0.0225,
"mi_Y": 1.413
},
{
"var": "age",
"pearson_T": 0.1913,
"pearson_Y": 0.2968,
"spearman_T": 0.1893,
"spearman_Y": 0.3077,
"pvalue_T": 0.0,
"pvalue_Y": 0.0,
"mi_T": 0.0152,
"mi_Y": 0.074
}
],
"causal_graph": {
"nodes": [
"treatment",
"health",
"base_health",
"age"
],
"edges": [
{
"from": "treatment",
"to": "health",
"type": "hypothesized"
},
{
"from": "base_health",
"to": "treatment",
"type": "confounding"
},
{
"from": "base_health",
"to": "health",
"type": "confounding"
},
{
"from": "age",
"to": "treatment",
"type": "confounding"
},
{
"from": "age",
"to": "health",
"type": "confounding"
}
],
"backdoor_paths": [
"treatment <- base_health -> health",
"treatment <- age -> health"
]
},
"identification": {
"strategy": "Backdoor Adjustment",
"adjustment_set": [
"age",
"base_health"
],
"reasoning": "发现 2 条后门路径。通过控制变量 ['age', 'base_health'] 可阻断所有后门路径,满足后门准则。"
},
"estimation": {
"ATE_Outcome_Regression": 0.2444,
"ATE_IPW": 0.2425,
"ATE_reported": 0.2435,
"95%_CI": [
0.2351,
0.2544
],
"interpretation": "在控制 ['age', 'base_health'] 后,接受处理使 health 平均变化 0.2435 (95%CI: 0.24-0.25)。",
"overlap_assumption": "满足",
"robustness": "稳健"
},
"log_path": "examples/medical_v2/log.md"
}
分析 #004
时间: 2026-03-29T22:05:26.081133
系统提示词
你是一位专业的因果推断分析师。你的任务是分析给定的数据,识别处理变量(treatment)、结果变量(outcome),并对每个变量进行时间层级解析。
请以 JSON 格式输出分析结果,不要包含任何额外的解释或思考过程。
JSON 输出规范:
{
"treatment": "处理变量名称",
"outcome": "结果变量名称",
"time_tiers": {
"变量名1": 整数层级,
"变量名2": 整数层级,
...
}
}
time_tiers 层级说明(整数,越小表示越早发生):
- -1: 非时间变量(如样本唯一标识符 id、index 等)
- 0: 人口学特征或不变的混杂因素(如 age、gender、region 等)
- 1: 基线测量(干预前测得,可能是混杂因素,如 baseline_score、pre_test 等)
- 2: 干预点/处理变量(如 treatment、intervention、policy 等)
- 3: 中介变量(干预后、结果前测得)
- 4: 随访结果/结果变量(如 outcome、post_test、score 等)
- 5+: 更晚的时间点(如有多次随访)
注意:
- 只输出上述 JSON 格式,不要包含其他字段
- treatment 和 outcome 必须是数据表格中真实存在的列名
- time_tiers 必须包含数据中的所有列名
- 不要使用 markdown 代码块标记(如 ```json)
- 直接输出纯 JSON 字符串
用户提示词
请分析以下数据,并严格按照 JSON 格式输出分析结果:
**数据概览:**
- 样本数量:500
- 变量:id, treatment, health, base_health, age
**统计摘要:**
id treatment health base_health age
count 500.000000 500.000000 500.000000 500.000000 500.000000
mean 250.500000 0.656000 0.588928 0.414174 44.732000
std 144.481833 0.475517 0.211767 0.175689 15.239707
min 1.000000 0.000000 0.022200 0.012700 18.000000
25% 125.750000 0.000000 0.436300 0.282500 32.000000
50% 250.500000 1.000000 0.585800 0.407950 45.000000
75% 375.250000 1.000000 0.741575 0.534325 57.000000
max 500.000000 1.000000 1.000000 0.902500 70.000000
JSON 输出格式要求:
{
"treatment": "处理变量名称",
"outcome": "结果变量名称",
"time_tiers": {
"列名1": 层级整数,
"列名2": 层级整数,
...
}
}
要求:
1. treatment 和 outcome 必须与表格列名完全一致
2. time_tiers 必须覆盖所有列名
3. 根据列名含义和统计摘要推断每个变量的时间层级
4. 只输出 JSON,不要包含其他任何内容
5. 不要使用 markdown 代码块标记
LLM 输出
{'treatment': 'treatment', 'outcome': 'health', 'time_tiers': {'id': -1, 'treatment': 2, 'health': 4, 'base_health': 1, 'age': 0}}
分析报告
{
"query_interpretation": {
"treatment": "treatment",
"outcome": "health",
"estimand": "ATE"
},
"causal_graph": {
"nodes": [
"treatment",
"health",
"base_health",
"age"
],
"edges": [
{
"from": "treatment",
"to": "health",
"type": "hypothesized"
},
{
"from": "base_health",
"to": "treatment",
"type": "confounding"
},
{
"from": "base_health",
"to": "health",
"type": "confounding"
},
{
"from": "age",
"to": "treatment",
"type": "confounding"
},
{
"from": "age",
"to": "health",
"type": "confounding"
}
],
"backdoor_paths": [
"treatment <- base_health -> health",
"treatment <- age -> health"
]
},
"identification": {
"strategy": "Backdoor Adjustment",
"adjustment_set": [
"age",
"base_health"
],
"reasoning": "发现 2 条后门路径。通过控制变量 ['age', 'base_health'] 可阻断所有后门路径,满足后门准则。"
},
"estimation": {
"ATE_Outcome_Regression": 0.2444,
"ATE_IPW": 0.2425,
"95%_CI": [
0.2356,
0.254
],
"interpretation": "在控制 ['age', 'base_health'] 后,接受处理使 health 平均变化 0.2435 (95%CI: 0.24-0.25)。"
},
"diagnostics": {
"balance_check": {
"age": {
"before": 0.4104,
"after": 0.0087
},
"base_health": {
"before": -0.3377,
"after": -0.0587
}
},
"overlap_assumption": "满足",
"robustness": "稳健"
},
"warnings": [
{
"type": "unobserved_confounding",
"message": "可能存在未观测混杂,建议进行敏感性分析。"
}
]
}
调用参数
{
"data_path": "examples/medical_v2/data.xlsx",
"sample_size": 500,
"variables": [
"id",
"treatment",
"health",
"base_health",
"age"
],
"treatment_variable": "treatment",
"outcome_variable": "health",
"time_tiers": {
"id": -1,
"treatment": 2,
"health": 4,
"base_health": 1,
"age": 0
},
"llm_params": {
"base_url": "http://10.106.123.247:8000/v1",
"model": "qwen3.5-35b",
"temperature": 0.3,
"max_tokens": 2048
},
"candidates": [
{
"var": "base_health",
"pearson_T": -0.1612,
"pearson_Y": 0.7356,
"spearman_T": -0.1505,
"spearman_Y": 0.7267,
"pvalue_T": 0.0007,
"pvalue_Y": 0.0,
"mi_T": 0.0225,
"mi_Y": 1.413
},
{
"var": "age",
"pearson_T": 0.1913,
"pearson_Y": 0.2968,
"spearman_T": 0.1893,
"spearman_Y": 0.3077,
"pvalue_T": 0.0,
"pvalue_Y": 0.0,
"mi_T": 0.0152,
"mi_Y": 0.074
}
],
"causal_graph": {
"nodes": [
"treatment",
"health",
"base_health",
"age"
],
"edges": [
{
"from": "treatment",
"to": "health",
"type": "hypothesized"
},
{
"from": "base_health",
"to": "treatment",
"type": "confounding"
},
{
"from": "base_health",
"to": "health",
"type": "confounding"
},
{
"from": "age",
"to": "treatment",
"type": "confounding"
},
{
"from": "age",
"to": "health",
"type": "confounding"
}
],
"backdoor_paths": [
"treatment <- base_health -> health",
"treatment <- age -> health"
]
},
"identification": {
"strategy": "Backdoor Adjustment",
"adjustment_set": [
"age",
"base_health"
],
"reasoning": "发现 2 条后门路径。通过控制变量 ['age', 'base_health'] 可阻断所有后门路径,满足后门准则。"
},
"estimation": {
"ATE_Outcome_Regression": 0.2444,
"ATE_IPW": 0.2425,
"ATE_reported": 0.2435,
"95%_CI": [
0.2356,
0.254
],
"interpretation": "在控制 ['age', 'base_health'] 后,接受处理使 health 平均变化 0.2435 (95%CI: 0.24-0.25)。",
"overlap_assumption": "满足",
"robustness": "稳健"
},
"log_path": "examples/medical_v2/log.md"
}