How to Generate 100+ Agent Configurations with LLM Batch Processing

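Introduction

Configuring hundreds of AI agents for a social media simulation sounds daunting. Each agent needs an activity schedule, a posting frequency, a response delay, an influence weight, and a stance. Doing this by hand takes hours.

MiroFish automates this with LLM-driven configuration generation. The system analyzes your documents, knowledge graph, and simulation requirements, then generates a detailed configuration for each agent.

The challenge: LLMs fail. Output gets cut off, JSON comes back malformed, and requests hit token limits.

This guide walks through the complete implementation:

Step-by-step generation (time → events → agents → platforms)
Batch processing to stay within context limits
JSON repair strategies for truncated output
Rule-based fallback configurations for when the LLM fails
Per-type agent activity patterns (student vs. official vs. media)
Validation and correction logic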

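The configuration pipeline processes 100+ agents through a series of API calls. Apidog was used to validate request/response schemas at each stage, catch malformed JSON before it reached production, and generate test cases for edge cases such as truncated LLM output.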

All code comes from MiroFish's production codebase.

Architecture Overview

The configuration generator uses a pipeline approach:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Context       │ ──► │   Time Config   │ ──► │   Event Config  │
│   Builder       │     │   Generator     │     │   Generator     │
│                 │     │                 │     │                 │
│ - Simulation    │     │ - Total hours   │     │ - Initial posts │
│   requirement   │     │ - Minutes per   │     │ - Hot topics    │
│ - Entity summary│     │   round         │     │ - Narrative     │
│ - Document text │     │ - Peak hours    │     │   direction     │
│                 │     │ - Activity      │     │                 │
│                 │     │   multipliers   │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                         │
                                                         ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Final Config  │ ◄── │   Platform      │ ◄── │   Agent Config  │
│   Assembly      │     │   Config        │     │   Batches       │
│                 │     │                 │     │                 │
│ - Merge all     │     │ - Twitter params│     │ - 15 agents     │
│ - Validate      │     │ - Reddit params │     │   per batch     │
│ - Save JSON     │     │ - Viral         │     │ - N batches     │
│                 │     │   threshold     │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘

File Structure

backend/app/services/
├── simulation_config_generator.py  # Main config generation logic
├── ontology_generator.py           # Ontology generation (shared)
└── zep_entity_reader.py            # Entity filtering

backend/app/models/
├── task.py                         # Task tracking
└── project.py                      # Project state

Step-by-Step Generation Strategy

Generating every configuration in a single call would exceed token limits. Instead, the system generates the configuration in stages:

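import json
import logging
import math
from typing import Any, Callable, Dict, List, Optional

logger = logging.getLogger(__name__)


class SimulationConfigGenerator:
    # Each batch generates configurations for 15 agents
    AGENTS_PER_BATCH = 15

    # Context limits (in characters)
    MAX_CONTEXT_LENGTH = 50000
    TIME_CONFIG_CONTEXT_LENGTH = 10000
    EVENT_CONFIG_CONTEXT_LENGTH = 8000
    ENTITY_SUMMARY_LENGTH = 300
    AGENT_SUMMARY_LENGTH = 300
    ENTITIES_PER_TYPE_DISPLAY = 20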

    def generate_config(
        self,
        simulation_id: str,
        project_id: str,
        graph_id: str,
        simulation_requirement: str,
        document_text: str,
        entities: List[EntityNode],
        enable_twitter: bool = True,
        enable_reddit: bool = True,
        progress_callback: Optional[Callable[[int, int, str], None]] = None,
    ) -> SimulationParameters:

        # Compute the total number of steps
        num_batches = math.ceil(len(entities) / self.AGENTS_PER_BATCH)
        total_steps = 3 + num_batches  # time + events + N agent batches + platforms
        current_step = 0

        def report_progress(step: int, message: str):
            nonlocal current_step
            current_step = step
            if progress_callback:
                progress_callback(step, total_steps, message)
            logger.info(f"[{step}/{total_steps}] {message}")

        # Build the context
        context = self._build_context(
            simulation_requirement=simulation_requirement,
            document_text=document_text,
            entities=entities
        )

        reasoning_parts = []

        # Step 1: generate the time configuration
        report_progress(1, "Generating time configuration...")
        time_config_result = self._generate_time_config(context, len(entities))
        time_config = self._parse_time_config(time_config_result, len(entities))
        reasoning_parts.append(f"Time config: {time_config_result.get('reasoning', 'success')}")

        # Step 2: generate the event configuration
        report_progress(2, "Generating event configuration and hot topics...")
        event_config_result = self._generate_event_config(context, simulation_requirement, entities)
        event_config = self._parse_event_config(event_config_result)
        reasoning_parts.append(f"Event config: {event_config_result.get('reasoning', 'success')}")

        # Steps 3 onward: generate agent configurations in batches
        all_agent_configs = []
        for batch_idx in range(num_batches):
            start_idx = batch_idx * self.AGENTS_PER_BATCH
            end_idx = min(start_idx + self.AGENTS_PER_BATCH, len(entities))
            batch_entities = entities[start_idx:end_idx]

            report_progress(
                3 + batch_idx,
                f"Generating agent configurations ({start_idx + 1}-{end_idx}/{len(entities)})..."
            )

            batch_configs = self._generate_agent_configs_batch(
                context=context,
                entities=batch_entities,
                start_idx=start_idx,
                simulation_requirement=simulation_requirement
            )
            all_agent_configs.extend(batch_configs)

        reasoning_parts.append(f"Agent config: generated {len(all_agent_configs)} agents")

        # Assign a poster to each initial post
        event_config = self._assign_initial_post_agents(event_config, all_agent_configs)

        # Final step: platform configuration
        report_progress(total_steps, "Generating platform configuration...")
        twitter_config = PlatformConfig(platform="twitter", ...) if enable_twitter else None
        reddit_config = PlatformConfig(platform="reddit", ...) if enable_reddit else None

        # Assemble the final configuration
        params = SimulationParameters(
            simulation_id=simulation_id,
            project_id=project_id,
            graph_id=graph_id,
            simulation_requirement=simulation_requirement,
            time_config=time_config,
            agent_configs=all_agent_configs,
            event_config=event_config,
            twitter_config=twitter_config,
            reddit_config=reddit_config,
            generation_reasoning=" | ".join(reasoning_parts)
        )

        return params

This staged approach:

Keeps each LLM call focused and manageable
Gives users progress updates
Allows partial recovery when a stage fails
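
To make the step accounting concrete, here is a minimal sketch of how the step count follows from the batch size. `plan_steps` is a hypothetical helper, not part of MiroFish; it only mirrors the arithmetic in `generate_config`.

```python
import math

AGENTS_PER_BATCH = 15  # same batch size as SimulationConfigGenerator


def plan_steps(num_entities: int, batch_size: int = AGENTS_PER_BATCH) -> list[str]:
    """Return the ordered step labels for one run (hypothetical helper)."""
    num_batches = math.ceil(num_entities / batch_size)
    steps = ["time config", "event config"]
    for batch_idx in range(num_batches):
        start = batch_idx * batch_size
        end = min(start + batch_size, num_entities)
        steps.append(f"agents {start + 1}-{end}")
    steps.append("platform config")
    return steps


steps = plan_steps(100)
print(len(steps))   # 10 steps: 2 + 7 agent batches + 1
print(steps[-2])    # agents 91-100
```

With 100 entities the last batch holds only 10 agents, which is why the end index is clamped with `min`.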

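Building the Context

The context builder gathers the relevant information while staying within a character budget that keeps prompts under the model's token limit: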

def _build_context(
    self,
    simulation_requirement: str,
    document_text: str,
    entities: List[EntityNode]
) -> str:

    # Summarize the entities
    entity_summary = self._summarize_entities(entities)

    context_parts = [
        f"## Simulation requirement\n{simulation_requirement}",
        f"\n## Entity information ({len(entities)} entities)\n{entity_summary}",
    ]

    # Append the document text if there is room left
    current_length = sum(len(p) for p in context_parts)
    remaining_length = self.MAX_CONTEXT_LENGTH - current_length - 500  # 500-character buffer

    if remaining_length > 0 and document_text:
        doc_text = document_text[:remaining_length]
        if len(document_text) > remaining_length:
            doc_text += "\n...(document truncated)"
        context_parts.append(f"\n## Original document\n{doc_text}")

    return "\n".join(context_parts)

Entity Summaries

Entities are summarized by type:

def _summarize_entities(self, entities: List[EntityNode]) -> str:
    lines = []

    # Group by type
    by_type: Dict[str, List[EntityNode]] = {}
    for e in entities:
        t = e.get_entity_type() or "Unknown"
        if t not in by_type:
            by_type[t] = []
        by_type[t].append(e)

    for entity_type, type_entities in by_type.items():
        lines.append(f"\n### {entity_type} ({len(type_entities)} entities)")

        # Show a limited number of entities, each with a length-capped summary
        display_count = self.ENTITIES_PER_TYPE_DISPLAY
        summary_len = self.ENTITY_SUMMARY_LENGTH

        for e in type_entities[:display_count]:
            summary_preview = (e.summary[:summary_len] + "...") if len(e.summary) > summary_len else e.summary
            lines.append(f"- {e.name}: {summary_preview}")

        if len(type_entities) > display_count:
            lines.append(f"  ...and {len(type_entities) - display_count} more")

    return "\n".join(lines)

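This produces output like the following:

### Student (45 entities)
- Zhang Wei: Active in the student union; posts frequently about campus events and academic pressure...
- Li Ming: Graduate student researching AI ethics; often shares technology news...
  ...and 43 more

### University (3 entities)
- Wuhan University: Official account; posts announcements and news...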

Time Configuration Generation

The time configuration determines the simulation duration and activity patterns:

def _generate_time_config(self, context: str, num_entities: int) -> Dict[str, Any]:
    # Truncate the context for this specific step
    context_truncated = context[:self.TIME_CONFIG_CONTEXT_LENGTH]

    # Upper bound for agents per hour (90% of the agent count)
    max_agents_allowed = max(1, int(num_entities * 0.9))

    prompt = f"""Generate a time configuration based on the following simulation requirements.

{context_truncated}

## Task
Generate the time configuration JSON.

### Basic principles (adjust for the event type and participant groups):
- The user base is Chinese and follows Beijing-time daily routines
- Hours 0-5: almost no activity (multiplier 0.05)
- Hours 6-8: activity gradually picks up (multiplier 0.4)
- Hours 9-18: working hours, moderate activity (multiplier 0.7)
- Hours 19-22: evening peak, most active (multiplier 1.5)
- Hour 23: activity declines (multiplier 0.5)

### Return JSON (no markdown):

Example:
{{
    "total_simulation_hours": 72,
    "minutes_per_round": 60,
    "agents_per_hour_min": 5,
    "agents_per_hour_max": 50,
    "peak_hours": [19, 20, 21, 22],
    "off_peak_hours": [0, 1, 2, 3, 4, 5],
    "morning_hours": [6, 7, 8],
    "work_hours": [9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
    "reasoning": "Explanation of the time configuration"
}}

Field descriptions:
- total_simulation_hours (int): 24-168 hours; shorter for breaking news, longer for ongoing topics
- minutes_per_round (int): 30-120 minutes; 60 recommended
- agents_per_hour_min (int): range 1-{max_agents_allowed}
- agents_per_hour_max (int): range 1-{max_agents_allowed}
- peak_hours (int array): adjust for the participant groups
- off_peak_hours (int array): usually late night / early morning
- morning_hours (int array): morning hours
- work_hours (int array): working hours
- reasoning (string): brief explanation"""

    system_prompt = "You are a social media simulation expert. Return pure JSON."

    try:
        return self._call_llm_with_retry(prompt, system_prompt)
    except Exception as e:
        logger.warning(f"Time config LLM generation failed: {e}; using defaults")
        return self._get_default_time_config(num_entities)

Parsing and Validating the Time Configuration

def _parse_time_config(self, result: Dict[str, Any], num_entities: int) -> TimeSimulationConfig:
    # Read the raw values
    agents_per_hour_min = result.get("agents_per_hour_min", max(1, num_entities // 15))
    agents_per_hour_max = result.get("agents_per_hour_max", max(5, num_entities // 5))

    # Validate and correct: never exceed the total number of agents
    if agents_per_hour_min > num_entities:
        logger.warning(f"agents_per_hour_min ({agents_per_hour_min}) exceeds the total agent count ({num_entities}); correcting")
        agents_per_hour_min = max(1, num_entities // 10)

    if agents_per_hour_max > num_entities:
        logger.warning(f"agents_per_hour_max ({agents_per_hour_max}) exceeds the total agent count ({num_entities}); correcting")
        agents_per_hour_max = max(agents_per_hour_min + 1, num_entities // 2)

    # Ensure min < max
    if agents_per_hour_min >= agents_per_hour_max:
        agents_per_hour_min = max(1, agents_per_hour_max // 2)
        logger.warning(f"agents_per_hour_min >= max; corrected to {agents_per_hour_min}")

    return TimeSimulationConfig(
        total_simulation_hours=result.get("total_simulation_hours", 72),
        minutes_per_round=result.get("minutes_per_round", 60),
        agents_per_hour_min=agents_per_hour_min,
        agents_per_hour_max=agents_per_hour_max,
        peak_hours=result.get("peak_hours", [19, 20, 21, 22]),
        off_peak_hours=result.get("off_peak_hours", [0, 1, 2, 3, 4, 5]),
        off_peak_activity_multiplier=0.05,
        morning_activity_multiplier=0.4,
        work_activity_multiplier=0.7,
        peak_activity_multiplier=1.5
    )
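
The three corrections are easier to follow with concrete numbers. The sketch below pulls the same rules out into a standalone function; `clamp_agents_per_hour` is a name chosen for illustration and does not exist in MiroFish.

```python
def clamp_agents_per_hour(lo: int, hi: int, num_entities: int) -> tuple[int, int]:
    """Apply the same three corrections as _parse_time_config (standalone sketch)."""
    if lo > num_entities:
        lo = max(1, num_entities // 10)
    if hi > num_entities:
        hi = max(lo + 1, num_entities // 2)
    if lo >= hi:
        lo = max(1, hi // 2)
    return lo, hi


print(clamp_agents_per_hour(200, 500, 100))  # (10, 50): both values exceeded 100
print(clamp_agents_per_hour(30, 20, 100))    # (10, 20): min was not below max
print(clamp_agents_per_hour(5, 50, 100))     # (5, 50): already valid, unchanged
```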

Default Time Configuration (China Time Zone)

def _get_default_time_config(self, num_entities: int) -> Dict[str, Any]:
    return {
        "total_simulation_hours": 72,
        "minutes_per_round": 60,  # one hour per round
        "agents_per_hour_min": max(1, num_entities // 15),
        "agents_per_hour_max": max(5, num_entities // 5),
        "peak_hours": [19, 20, 21, 22],
        "off_peak_hours": [0, 1, 2, 3, 4, 5],
        "morning_hours": [6, 7, 8],
        "work_hours": [9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
        "reasoning": "Using the default China time zone configuration"
    }
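
The source does not show how the simulator consumes these hour lists, so the following is only a sketch of one plausible lookup. It maps an hour of the day to an activity multiplier using the default lists and the multipliers from `TimeSimulationConfig`. The function name and the 0.5 fallback for hours in no list (hour 23, taken from the prompt guidance) are assumptions.

```python
DEFAULT_TIME_CONFIG = {
    "peak_hours": [19, 20, 21, 22],
    "off_peak_hours": [0, 1, 2, 3, 4, 5],
    "morning_hours": [6, 7, 8],
    "work_hours": [9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
}

# Multipliers as defined in TimeSimulationConfig.
MULTIPLIERS = {
    "peak_hours": 1.5,
    "off_peak_hours": 0.05,
    "morning_hours": 0.4,
    "work_hours": 0.7,
}


def activity_multiplier(hour: int, config: dict = DEFAULT_TIME_CONFIG) -> float:
    """Look up the activity multiplier for an hour of the day (0-23)."""
    for key, multiplier in MULTIPLIERS.items():
        if hour in config.get(key, []):
            return multiplier
    return 0.5  # assumed fallback for hours not covered by any list


print([activity_multiplier(h) for h in (3, 7, 12, 20, 23)])
# [0.05, 0.4, 0.7, 1.5, 0.5]
```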

Event Configuration Generation

The event configuration defines the initial posts, hot topics, and narrative direction:

def _generate_event_config(
    self,
    context: str,
    simulation_requirement: str,
    entities: List[EntityNode]
) -> Dict[str, Any]:

    # Collect the available entity types for the LLM to reference
    entity_types_available = list(set(
        e.get_entity_type() or "Unknown" for e in entities
    ))

    # Collect up to three example names per type
    type_examples = {}
    for e in entities:
        etype = e.get_entity_type() or "Unknown"
        if etype not in type_examples:
            type_examples[etype] = []
        if len(type_examples[etype]) < 3:
            type_examples[etype].append(e.name)

    type_info = "\n".join([
        f"- {t}: {', '.join(examples)}"
        for t, examples in type_examples.items()
    ])

    context_truncated = context[:self.EVENT_CONFIG_CONTEXT_LENGTH]

    prompt = f"""Generate an event configuration based on the following simulation requirements.

Simulation requirement: {simulation_requirement}

{context_truncated}

## Available entity types and examples
{type_info}

## Task
Generate the event configuration JSON.
- Extract hot-topic keywords
- Describe the narrative direction
- Design the initial posts. **Every post must specify a poster_type**

**Important**: poster_type must be chosen from the "Available entity types" above so that each initial post can be assigned to a suitable agent.

For example, official statements should be posted by the Official/University type, news by MediaOutlet, and student opinions by Student.

Return JSON (no markdown):
{{
    "hot_topics": ["keyword1", "keyword2", ...],
    "narrative_direction": "<description of the narrative direction>",
    "initial_posts": [
        {{"content": "post content", "poster_type": "entity type (must match an available type)"}},
        ...
    ],
    "reasoning": "<brief explanation>"
}}"""

    system_prompt = "You are a public opinion analysis expert. Return pure JSON."

    try:
        return self._call_llm_with_retry(prompt, system_prompt)
    except Exception as e:
        logger.warning(f"Event config LLM generation failed: {e}; using defaults")
        return {
            "hot_topics": [],
            "narrative_direction": "",
            "initial_posts": [],
            "reasoning": "Using the default configuration"
        }

Assigning Posters to Initial Posts

After the initial posts are generated, each one is matched to an actual agent:

def _assign_initial_post_agents(
    self,
    event_config: EventConfig,
    agent_configs: List[AgentActivityConfig]
) -> EventConfig:

    if not event_config.initial_posts:
        return event_config

    # Index agents by type
    agents_by_type: Dict[str, List[AgentActivityConfig]] = {}
    for agent in agent_configs:
        etype = agent.entity_type.lower()
        if etype not in agents_by_type:
            agents_by_type[etype] = []
        agents_by_type[etype].append(agent)

    # Type alias mapping (handles variations in LLM output)
    type_aliases = {
        "official": ["official", "university", "governmentagency", "government"],
        "university": ["university", "official"],
        "mediaoutlet": ["mediaoutlet", "media"],
        "student": ["student", "person"],
        "professor": ["professor", "expert", "teacher"],
        "alumni": ["alumni", "person"],
        "organization": ["organization", "ngo", "company", "group"],
        "person": ["person", "student", "alumni"],
    }

    # Per-type cursor so successive posts rotate through the agents of a type
    used_indices: Dict[str, int] = {}

    updated_posts = []
    for post in event_config.initial_posts:
        poster_type = post.get("poster_type", "").lower()
        content = post.get("content", "")

        matched_agent_id = None

        # 1. Direct match
        if poster_type in agents_by_type:
            agents = agents_by_type[poster_type]
            idx = used_indices.get(poster_type, 0) % len(agents)
            matched_agent_id = agents[idx].agent_id
            used_indices[poster_type] = idx + 1
        else:
            # 2. Alias match
            for alias_key, aliases in type_aliases.items():
                if poster_type in aliases or alias_key == poster_type:
                    for alias in aliases:
                        if alias in agents_by_type:
                            agents = agents_by_type[alias]
                            idx = used_indices.get(alias, 0) % len(agents)
                            matched_agent_id = agents[idx].agent_id
                            used_indices[alias] = idx + 1
                            break
                    if matched_agent_id is not None:
                        break

        # 3. Fallback: use the agent with the highest influence weight
        if matched_agent_id is None:
            logger.warning(f"No agent matches type '{poster_type}'; using the agent with the highest influence weight")
            if agent_configs:
                sorted_agents = sorted(agent_configs, key=lambda a: a.influence_weight, reverse=True)
                matched_agent_id = sorted_agents[0].agent_id
            else:
                matched_agent_id = 0

        updated_posts.append({
            "content": content,
            "poster_type": post.get("poster_type", "Unknown"),
            "poster_agent_id": matched_agent_id
        })

        logger.info(f"Initial post assignment: poster_type='{poster_type}' -> agent_id={matched_agent_id}")

    event_config.initial_posts = updated_posts
    return event_config
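
The rotation through agents of the same type is the least obvious part of this function. Below is a reduced sketch of just that part, using plain `(agent_id, entity_type)` tuples instead of `AgentActivityConfig`. The helper name is hypothetical, and the alias lookup and influence-based fallback are deliberately left out.

```python
from collections import defaultdict


def assign_round_robin(post_types: list[str], agents: list[tuple[int, str]]) -> list[int]:
    """Assign each post to an agent of the matching type, cycling through them."""
    by_type: dict[str, list[int]] = defaultdict(list)
    for agent_id, entity_type in agents:
        by_type[entity_type.lower()].append(agent_id)

    cursor: dict[str, int] = defaultdict(int)  # next position to use, per type
    assigned = []
    for post_type in post_types:
        key = post_type.lower()
        candidates = by_type.get(key)
        if not candidates:
            raise LookupError(f"no agent of type {post_type!r}")
        assigned.append(candidates[cursor[key] % len(candidates)])
        cursor[key] += 1
    return assigned


agents = [(0, "Student"), (1, "Student"), (2, "MediaOutlet")]
print(assign_round_robin(["Student", "MediaOutlet", "student", "Student"], agents))
# [0, 2, 1, 0]
```

The third student post wraps back to agent 0 because only two student agents exist. Matching is case-insensitive, as in the original.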

Batch Agent Configuration Generation

Generating configurations for hundreds of agents in one call would exceed token limits, so the system processes them in batches of 15:

def _generate_agent_configs_batch(
    self,
    context: str,
    entities: List[EntityNode],
    start_idx: int,
    simulation_requirement: str
) -> List[AgentActivityConfig]:

    # Build the entity list with length-capped summaries
    entity_list = []
    summary_len = self.AGENT_SUMMARY_LENGTH
    for i, e in enumerate(entities):
        entity_list.append({
            "agent_id": start_idx + i,
            "entity_name": e.name,
            "entity_type": e.get_entity_type() or "Unknown",
            "summary": e.summary[:summary_len] if e.summary else ""
        })

    prompt = f"""Generate a social media activity configuration for each entity based on the following information.

Simulation requirement: {simulation_requirement}

## Entity list
```json
{json.dumps(entity_list, ensure_ascii=False, indent=2)}
```

## Task
Generate an activity configuration for each entity. Note:
- **Times must follow Chinese daily routines**: almost no activity in hours 0-5, most active in hours 19-22
- **Official institutions** (universities/government agencies): low activity (0.1-0.3), working hours (9-17), slow responses (60-240 minutes), high influence (2.5-3.0)
- **Media** (media outlets): moderate activity (0.4-0.6), active all day (8-23), fast responses (5-30 minutes), high influence (2.0-2.5)
- **Individuals** (students/individuals/alumni): high activity (0.6-0.9), mainly evenings (18-23), fast responses (1-15 minutes), low influence (0.8-1.2)
- **Public figures/experts**: moderate activity (0.4-0.6), medium-to-high influence (1.5-2.0)
"""

    system_prompt = "You are a social media behavior analysis expert. Return pure JSON."

    # The parsing below expects {"agent_configs": [{"agent_id": ..., ...}, ...]}
    try:
        result = self._call_llm_with_retry(prompt, system_prompt)
        llm_configs = {cfg["agent_id"]: cfg for cfg in result.get("agent_configs", [])}
    except Exception as e:
        logger.warning(f"Agent config batch LLM generation failed: {e}; using rule-based generation")
        llm_configs = {}

    # Build the AgentActivityConfig objects
    configs = []
    for i, entity in enumerate(entities):
        agent_id = start_idx + i
        cfg = llm_configs.get(agent_id, {})

        # Use the rule-based fallback when the LLM gave nothing for this agent
        if not cfg:
            cfg = self._generate_agent_config_by_rule(entity)

        config = AgentActivityConfig(
            agent_id=agent_id,
            entity_uuid=entity.uuid,
            entity_name=entity.name,
            entity_type=entity.get_entity_type() or "Unknown",
            activity_level=cfg.get("activity_level", 0.5),
            posts_per_hour=cfg.get("posts_per_hour", 0.5),
            comments_per_hour=cfg.get("comments_per_hour", 1.0),
            active_hours=cfg.get("active_hours", list(range(9, 23))),
            response_delay_min=cfg.get("response_delay_min", 5),
            response_delay_max=cfg.get("response_delay_max", 60),
            sentiment_bias=cfg.get("sentiment_bias", 0.0),
            stance=cfg.get("stance", "neutral"),
            influence_weight=cfg.get("influence_weight", 1.0)
        )
        configs.append(config)

    return configs
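
Note that the fallback works per agent, not just per batch: if the LLM returns configurations for only some of the requested IDs, the missing ones are filled in by rule. The sketch below isolates that merge step. `merge_batch` is a hypothetical helper, and unlike the original it also skips malformed entries that lack an `agent_id` instead of discarding the whole batch.

```python
from typing import Any, Callable, Dict, List


def merge_batch(
    agent_ids: List[int],
    llm_result: Dict[str, Any],
    fallback: Callable[[int], Dict[str, Any]],
) -> Dict[int, Dict[str, Any]]:
    """Keep LLM configs where present; fill every gap from `fallback`."""
    by_id = {
        cfg["agent_id"]: cfg
        for cfg in llm_result.get("agent_configs", [])
        if isinstance(cfg, dict) and "agent_id" in cfg
    }
    return {agent_id: by_id.get(agent_id) or fallback(agent_id) for agent_id in agent_ids}


# The LLM returned configs for agents 15 and 17 but skipped 16.
llm_result = {"agent_configs": [
    {"agent_id": 15, "activity_level": 0.8},
    {"agent_id": 17, "activity_level": 0.3},
]}
merged = merge_batch([15, 16, 17], llm_result, lambda _id: {"activity_level": 0.5, "source": "rule"})
print(merged[16])  # {'activity_level': 0.5, 'source': 'rule'}
```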

Rule-Based Fallback Configuration

When the LLM fails, the system falls back to predefined patterns:

def _generate_agent_config_by_rule(self, entity: EntityNode) -> Dict[str, Any]:
    entity_type = (entity.get_entity_type() or "Unknown").lower()

    if entity_type in ["university", "governmentagency", "ngo"]:
        # Official institutions: working hours, low frequency, high influence
        return {
            "activity_level": 0.2,
            "posts_per_hour": 0.1,
            "comments_per_hour": 0.05,
            "active_hours": list(range(9, 18)),  # 9:00-17:59
            "response_delay_min": 60,
            "response_delay_max": 240,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 3.0
        }

    elif entity_type in ["mediaoutlet"]:
        # Media: active all day, medium frequency, high influence
        return {
            "activity_level": 0.5,
            "posts_per_hour": 0.8,
            "comments_per_hour": 0.3,
            "active_hours": list(range(7, 24)),  # 7:00-23:59
            "response_delay_min": 5,
            "response_delay_max": 30,
            "sentiment_bias": 0.0,
            "stance": "observer",
            "influence_weight": 2.5
        }

    elif entity_type in ["professor", "expert", "official"]:
        # Experts/professors: working hours plus evening, medium frequency
        return {
            "activity_level": 0.4,
            "posts_per_hour": 0.3,
            "comments_per_hour": 0.5,
            "active_hours": list(range(8, 22)),  # 8:00-21:59
            "response_delay_min": 15,
            "response_delay_max": 90,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 2.0
        }

    elif entity_type in ["student"]:
        # Students: evening peak, high frequency
        return {
            "activity_level": 0.8,
            "posts_per_hour": 0.6,
            "comments_per_hour": 1.5,
            "active_hours": [8, 9, 10, 11, 12, 13, 18, 19, 20, 21, 22, 23],
            "response_delay_min": 1,
            "response_delay_max": 15,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 0.8
        }

    elif entity_type in ["alumni"]:
        # Alumni: mostly evenings
        return {
            "activity_level": 0.6,
            "posts_per_hour": 0.4,
            "comments_per_hour": 0.8,
            "active_hours": [12, 13, 19, 20, 21, 22, 23],  # lunch + evening
            "response_delay_min": 5,
            "response_delay_max": 30,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 1.0
        }

    else:
        # Default individual: evening peak
        return {
            "activity_level": 0.7,
            "posts_per_hour": 0.5,
            "comments_per_hour": 1.2,
            "active_hours": [9, 10, 11, 12, 13, 18, 19, 20, 21, 22, 23],
            "response_delay_min": 2,
            "response_delay_max": 20,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 1.0
        }
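
Rule-based values are safe by construction, but LLM-supplied values are not. The source does not show any range checks on agent configurations, so the following is purely a hypothetical sketch of how one might clamp them to the ranges documented on `AgentActivityConfig` (activity level 0.0-1.0, sentiment bias -1.0 to 1.0, hours 0-23).

```python
from typing import Any, Dict


def sanitize_agent_config(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Clamp an agent config to the documented ranges (hypothetical helper)."""
    out = dict(cfg)
    out["activity_level"] = min(1.0, max(0.0, float(cfg.get("activity_level", 0.5))))
    out["sentiment_bias"] = min(1.0, max(-1.0, float(cfg.get("sentiment_bias", 0.0))))
    # Keep only valid hours of the day, deduplicated and sorted.
    hours = {int(h) for h in cfg.get("active_hours", range(9, 23)) if 0 <= int(h) <= 23}
    out["active_hours"] = sorted(hours)
    # Ensure the delay window is non-negative and ordered.
    lo = max(0, int(cfg.get("response_delay_min", 5)))
    hi = max(lo, int(cfg.get("response_delay_max", 60)))
    out["response_delay_min"], out["response_delay_max"] = lo, hi
    return out


raw = {
    "activity_level": 1.4,
    "active_hours": [22, 23, 24, 22],
    "response_delay_min": 30,
    "response_delay_max": 10,
}
clean = sanitize_agent_config(raw)
print(clean["activity_level"], clean["active_hours"], clean["response_delay_max"])
```

Here the out-of-range activity level is capped, hour 24 and the duplicate 22 are dropped, and the inverted delay window is raised so that the maximum is not below the minimum.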

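LLM Calls with Retry and JSON Repair

LLM calls fail in several ways: the API call itself can raise an error, the output can be truncated, and the JSON can be malformed. The system handles all three: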

def _call_llm_with_retry(self, prompt: str, system_prompt: str) -> Dict[str, Any]:
    import time

    max_attempts = 3
    last_error = None

    for attempt in range(max_attempts):
        try:
            response = self.client.chat.completions.create(
                model=self.model_name,
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": prompt}
                ],
                response_format={"type": "json_object"},
                temperature=0.7 - (attempt * 0.1)  # lower the temperature on each retry
            )

            content = response.choices[0].message.content
            finish_reason = response.choices[0].finish_reason

            # Check whether the output was truncated
            if finish_reason == 'length':
                logger.warning(f"LLM output was truncated (attempt {attempt+1})")
                content = self._fix_truncated_json(content)

            # Try to parse the JSON
            try:
                return json.loads(content)
            except json.JSONDecodeError as e:
                logger.warning(f"JSON parsing failed (attempt {attempt+1}): {str(e)[:80]}")

                # Try to repair the JSON
                fixed = self._try_fix_config_json(content)
                if fixed:
                    return fixed

                last_error = e

        except Exception as e:
            logger.warning(f"LLM call failed (attempt {attempt+1}): {str(e)[:80]}")
            last_error = e
            time.sleep(2 * (attempt + 1))  # linear backoff: 2s, 4s, 6s

    raise last_error or Exception("LLM call failed")
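
The retry policy can be exercised without a real LLM by injecting the call as a function. The sketch below is a simplified stand-in for `_call_llm_with_retry`, without the JSON repair step. `call_with_retry` and the `flaky` stub are hypothetical names. As in the original, only API-level errors trigger a backoff sleep; malformed JSON is retried immediately.

```python
import json
import time
from typing import Any, Callable, Dict, Optional


def call_with_retry(
    call: Callable[[float], str],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call `call(temperature)` until it returns parseable JSON."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        temperature = round(0.7 - attempt * 0.1, 2)
        try:
            return json.loads(call(temperature))
        except json.JSONDecodeError as exc:
            last_error = exc          # malformed output: retry immediately
        except Exception as exc:      # transport/API error: back off first
            last_error = exc
            sleep(2 * (attempt + 1))
    raise last_error or RuntimeError("LLM call failed")


# Stub standing in for the LLM client: times out, returns bad JSON, then succeeds.
seen = []

def flaky(temperature: float) -> str:
    seen.append(temperature)
    if len(seen) == 1:
        raise TimeoutError("simulated timeout")
    return "{broken" if len(seen) == 2 else '{"ok": true}'


print(call_with_retry(flaky, sleep=lambda s: None), seen)
# {'ok': True} [0.7, 0.6, 0.5]
```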

Fixing Truncated JSON

def _fix_truncated_json(self, content: str) -> str:
    content = content.strip()

    # Count unclosed braces and brackets
    open_braces = content.count('{') - content.count('}')
    open_brackets = content.count('[') - content.count(']')

    # Close a string that was probably cut off mid-value
    if content and content[-1] not in '",}]':
        content += '"'

    # Close the brackets and braces
    content += ']' * open_brackets
    content += '}' * open_braces

    return content
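
One limitation of the counting approach is that it appends every `]` before every `}`, which mis-nests the result when the cut happens inside an object that sits inside an array (the usual shape of `agent_configs`). As an alternative, and not MiroFish's implementation, a small stack can record the closers in nesting order while also ignoring brackets that appear inside strings:

```python
import json


def close_truncated_json(text: str) -> str:
    """Close an unterminated string and any open brackets in nesting order."""
    stack = []          # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    repaired = text.rstrip()
    if in_string:
        repaired += '"'
    repaired = repaired.rstrip(",")   # a dangling comma would break the parse
    return repaired + "".join(reversed(stack))


truncated = '{"agent_configs": [{"agent_id": 1, "stance": "neu'
print(json.loads(close_truncated_json(truncated)))
# {'agent_configs': [{'agent_id': 1, 'stance': 'neu'}]}
```

This still cannot rescue every cut. Output that ends right after a key or a colon remains invalid, so a retry or the rule-based fallback is still needed behind it.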

Advanced JSON Repair

def _try_fix_config_json(self, content: str) -> Optional[Dict[str, Any]]:
    import re

    # Repair truncation first
    content = self._fix_truncated_json(content)

    # Extract the JSON object
    json_match = re.search(r'\{[\s\S]*\}', content)
    if json_match:
        json_str = json_match.group()

        # Replace newlines inside string literals with spaces
        def fix_string(match):
            s = match.group(0)
            s = s.replace('\n', ' ').replace('\r', ' ')
            s = re.sub(r'\s+', ' ', s)
            return s

        json_str = re.sub(r'"[^"\\]*(?:\\.[^"\\]*)*"', fix_string, json_str)

        try:
            return json.loads(json_str)
        except json.JSONDecodeError:
            # Fall back to stripping control characters
            json_str = re.sub(r'[\x00-\x1f\x7f-\x9f]', ' ', json_str)
            json_str = re.sub(r'\s+', ' ', json_str)
            try:
                return json.loads(json_str)
            except json.JSONDecodeError:
                pass

    return None
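
To see what the extraction and newline handling buy you, here is a self-contained sketch of those two steps applied to a response that wraps the JSON in chatter and contains a raw newline inside a string. `extract_json_object` is an illustrative name; it reuses the same two regular expressions as the method above.

```python
import json
import re
from typing import Any, Dict, Optional


def extract_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Parse the outermost {...} span in `text`, flattening whitespace in strings."""
    match = re.search(r"\{[\s\S]*\}", text)
    if not match:
        return None
    candidate = match.group()
    # Raw newlines are illegal inside JSON strings; collapse them to spaces.
    candidate = re.sub(
        r'"[^"\\]*(?:\\.[^"\\]*)*"',
        lambda m: re.sub(r"\s+", " ", m.group(0)),
        candidate,
    )
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


wrapped = 'Sure, here is the config:\n{"reasoning": "line one\nline two", "hours": 72}\nAnything else?'
print(extract_json_object(wrapped))
# {'reasoning': 'line one line two', 'hours': 72}
```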

Configuration Data Structures

Agent Activity Configuration

from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentActivityConfig:
    """Activity configuration for a single agent"""
    agent_id: int
    entity_uuid: str
    entity_name: str
    entity_type: str

    # Activity level (0.0-1.0)
    activity_level: float = 0.5

    # Posting frequency (per hour)
    posts_per_hour: float = 1.0
    comments_per_hour: float = 2.0

    # Active hours (24-hour clock, 0-23)
    active_hours: List[int] = field(default_factory=lambda: list(range(8, 23)))

    # Response speed (reaction delay in simulated minutes)
    response_delay_min: int = 5
    response_delay_max: int = 60

    # Sentiment bias (-1.0 to 1.0, negative to positive)
    sentiment_bias: float = 0.0

    # Stance on the topic at hand
    stance: str = "neutral"  # supportive, opposing, neutral, observer

    # Influence weight (affects the probability of being seen)
    influence_weight: float = 1.0

Time Simulation Configuration

@dataclass
class TimeSimulationConfig:
    """Time simulation configuration (China time zone)"""
    total_simulation_hours: int = 72  # default: 72 hours (3 days)
    minutes_per_round: int = 60  # 60 minutes per round

    # Number of agents activated per hour
    agents_per_hour_min: int = 5
    agents_per_hour_max: int = 20

    # Peak hours (evening, 19-22, when Chinese users are most active)
    peak_hours: List[int] = field(default_factory=lambda: [19, 20, 21, 22])
    peak_activity_multiplier: float = 1.5

    # Off-peak hours (early morning, 0-5, almost no activity)
    off_peak_hours: List[int] = field(default_factory=lambda: [0, 1, 2, 3, 4, 5])
    off_peak_activity_multiplier: float = 0.05

    # Morning hours
    morning_hours: List[int] = field(default_factory=lambda: [6, 7, 8])
    morning_activity_multiplier: float = 0.4

    # Working hours
    work_hours: List[int] = field(default_factory=lambda: [9, 10, 11, 12, 13, 14, 15, 16, 17, 18])
    work_activity_multiplier: float = 0.7

Complete Simulation Parameters

@dataclass
class SimulationParameters:
    """Complete simulation parameter configuration"""
    simulation_id: str
    project_id: str
    graph_id: str
    simulation_requirement: str

    time_config: TimeSimulationConfig = field(default_factory=TimeSimulationConfig)
    agent_configs: List[AgentActivityConfig] = field(default_factory=list)
    event_config: EventConfig = field(default_factory=EventConfig)
    twitter_config: Optional[PlatformConfig] = None
    reddit_config: Optional[PlatformConfig] = None

    llm_model: str = ""
    llm_base_url: str = ""

    generated_at: str = field(default_factory=lambda: datetime.now().isoformat())
    generation_reasoning: str = ""

    def to_dict(self) -> Dict[str, Any]:
        time_dict = asdict(self.time_config)
        return {
            "simulation_id": self.simulation_id,
            "project_id": self.project_id,
            "graph_id": self.graph_id,
            "simulation_requirement": self.simulation_requirement,
            "time_config": time_dict,
            "agent_configs": [asdict(a) for a in self.agent_configs],
            "event_config": asdict(self.event_config),
            "twitter_config": asdict(self.twitter_config) if self.twitter_config else None,
            "reddit_config": asdict(self.reddit_config) if self.reddit_config else None,
            "llm_model": self.llm_model,
            "llm_base_url": self.llm_base_url,
            "generated_at": self.generated_at,
            "generation_reasoning": self.generation_reasoning,
        }
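
The pipeline's final stage saves the assembled parameters as JSON. The exact save code is not shown in the source, so the following is only a sketch of the round trip using two cut-down stand-in dataclasses (`MiniAgent` and `MiniParams` are hypothetical). It relies on `dataclasses.asdict`, which recurses into nested dataclasses and lists.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class MiniAgent:            # stand-in for AgentActivityConfig
    agent_id: int
    entity_name: str
    active_hours: List[int] = field(default_factory=lambda: list(range(8, 23)))


@dataclass
class MiniParams:           # stand-in for SimulationParameters
    simulation_id: str
    agent_configs: List[MiniAgent] = field(default_factory=list)

    def to_json(self) -> str:
        # ensure_ascii=False keeps non-ASCII entity names readable in the file.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


params = MiniParams("sim_001", [MiniAgent(0, "Wuhan University", [9, 10, 11])])
restored = json.loads(params.to_json())
print(restored["agent_configs"][0]["entity_name"])  # Wuhan University
```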

Summary Table: Activity Patterns by Agent Type

Agent type	Activity level	Active hours	Posts/hour	Comments/hour	Response delay (min)	Influence weight
University	0.2	9-17	0.1	0.05	60-240	3.0
Government agency	0.2	9-17	0.1	0.05	60-240	3.0
Media outlet	0.5	7-23	0.8	0.3	5-30	2.5
Professor	0.4	8-21	0.3	0.5	15-90	2.0
Student	0.8	8-13, 18-23	0.6	1.5	1-15	0.8
Alumni	0.6	12-13, 19-23	0.4	0.8	5-30	1.0
Individual (default)	0.7	9-13, 18-23	0.5	1.2	2-20	1.0

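Conclusion

LLM-driven configuration generation needs careful handling in six areas:

**Step-by-step generation**: split the work into manageable stages (time → events → agents → platforms)
**Batch processing**: process 15 agents per batch to stay within context limits
**JSON repair**: handle truncation by balancing brackets, closing unterminated strings, and flattening newlines inside strings
**Rule-based fallback**: supply sensible defaults when the LLM fails
**Type-specific patterns**: different agent types follow different activity patterns
**Validation and correction**: check generated values and fix problems (for example, `agents_per_hour_max` exceeding the total number of agents)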
