Communications Mining User Guide

Last updated November 10, 2025

Batch upload

Important:

Billable operations

You are charged 1 AI Unit or 0.2 Platform Units for each comment created, and for each comment updated (matched by its unique ID) if its text was modified.

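The CLI lets you upload comments in batches, including pre-annotated comments. Besides importing data into Communications Mining™ when no live connection is needed, you can use it to upload pre-existing training data or to overwrite existing comments or labels in Communications Mining.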

Note: This section assumes you have already configured the CLI.

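Preparing data

The CLI expects data in JSONL format (also called newline-delimited JSON), in which each line is a JSON value. Many tools can export JSONL files. If you have any problems, contact the support team.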
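To make the "one JSON value per line" rule concrete, here is a minimal Python sketch that parses JSONL input line by line. The helper name `iter_jsonl` and the sample records are illustrative assumptions, not part of the CLI.

```python
import json
from typing import Any, Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, such as a trailing newline
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report which line is malformed so it can be fixed before upload.
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc


# Hypothetical sample: two lines, each a complete JSON object.
sample = ['{"comment": {"id": "a1"}}\n', '{"comment": {"id": "a2"}}\n']
records = list(iter_jsonl(sample))
```

Passing an open file object instead of the `sample` list works the same way, because iterating a text file yields its lines.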

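Each line in the JSONL file represents one comment object. Every comment object must have at least a unique ID, a timestamp, and some text, and may carry additional fields such as metadata. To find out which fields to set for your data, see the comment reference.

Each line in the JSONL file should follow the format below (only required fields are shown). The example is indented for readability, but in the file each object must occupy a single line.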

{
  "comment": {
    "id": "<unique id>",
    "timestamp": "<timestamp>",
    "messages": [
      {
        "body": {
          "text": "<text of the comment>"
        }
      }
    ]
  }
}
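As a rough illustration of the format above, the following Python sketch serializes comments with only the required fields, one object per line, and rejects duplicate IDs. The function names and the sample values are assumptions made for this example; in particular, the ISO 8601 timestamp is a guess at a suitable format, so check the comment reference for the accepted one.

```python
import io
import json


def comment_line(comment_id: str, timestamp: str, text: str) -> str:
    """Serialize one comment with only the required fields onto a single line."""
    record = {
        "comment": {
            "id": comment_id,
            "timestamp": timestamp,
            "messages": [{"body": {"text": text}}],
        }
    }
    # Without `indent`, json.dumps emits no raw newlines; a newline inside
    # the text is escaped as \n, so the object stays on one line.
    return json.dumps(record, ensure_ascii=False)


def write_jsonl(rows, handle) -> int:
    """Write (id, timestamp, text) rows to `handle`; return how many were written."""
    seen = set()
    for comment_id, timestamp, text in rows:
        if comment_id in seen:
            raise ValueError(f"duplicate comment id: {comment_id}")
        seen.add(comment_id)
        handle.write(comment_line(comment_id, timestamp, text) + "\n")
    return len(seen)


# Hypothetical data; the timestamp format is an assumption.
buffer = io.StringIO()
count = write_jsonl([("c1", "2024-01-01T09:00:00Z", "First line\nsecond line")], buffer)
```

To produce a file for the CLI, pass a handle opened with `open("comments.jsonl", "w", encoding="utf-8")` in place of the in-memory buffer.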

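To upload labels along with the comments, add them as follows (again, the example is indented for readability, but in the file each object must occupy a single line):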

{
  "comment": {
    "id": "<unique id>",
    "timestamp": "<timestamp>",
    "messages": [
      {
        "body": {
          "text": "<text of the comment>"
        }
      }
    ]
  },
  "labelling": {
    "assigned": [
      {
        "name": "<Your Label Name>",
        "sentiment": "<positive|negative>"
      },
      {
        "name": "<Another Label Name>",
        "sentiment": "<positive|negative>"
      }
    ]
  }
}
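The labelled format can be produced by attaching a "labelling" section to an existing comment record. The Python sketch below mirrors the template shown above; the helper `with_labels`, the label names, and the strict check on sentiment values are assumptions for illustration, not documented CLI behaviour.

```python
import json

# Sentiment values taken from the template above.
ALLOWED_SENTIMENTS = {"positive", "negative"}


def with_labels(record: dict, labels) -> dict:
    """Return a copy of `record` with a "labelling" section of assigned labels.

    `labels` is an iterable of (name, sentiment) pairs.
    """
    assigned = []
    for name, sentiment in labels:
        if sentiment not in ALLOWED_SENTIMENTS:
            raise ValueError(f"unsupported sentiment {sentiment!r} for label {name!r}")
        assigned.append({"name": name, "sentiment": sentiment})
    return {**record, "labelling": {"assigned": assigned}}


# Hypothetical comment and label names.
comment = {
    "comment": {
        "id": "c1",
        "timestamp": "2024-01-01T09:00:00Z",
        "messages": [{"body": {"text": "My card was charged twice."}}],
    }
}
labelled = with_labels(comment, [("Billing", "negative"), ("Refund Request", "negative")])
line = json.dumps(labelled, ensure_ascii=False)  # one line, ready for the JSONL file
```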

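Uploading data

Uploading comments

The following command uploads comments to the specified source. We recommend uploading to a new, empty source, because that makes rollback easier if something goes wrong: you simply delete the source.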

re create comments \
  --source <project_name/source_name> \
  --file <file_name.jsonl>

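To update existing comments, specify the --overwrite flag. Existing comments are matched and overwritten based on the comment.id field. We recommend backing up the source before updating comments so that you can restore the originals if something goes wrong.

Uploading comments with labels

To upload labels together with comments, specify the dataset the labels should be uploaded to. The dataset must be connected to the source before you start the upload.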

re create comments \
  --source <project_name/source_name> \
  --dataset <project_name/dataset_name> \
  --file <file_name.jsonl>
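
If uploads are scripted, the command variants described in this section can be assembled programmatically. The Python sketch below only builds the argument list from the flags named above (--source, --dataset, --file, --overwrite); the function name and the project, source, and dataset names are hypothetical, and the sketch does not run the CLI itself.

```python
import shlex


def upload_command(source, file_name, dataset=None, overwrite=False):
    """Build the argument list for `re create comments`."""
    argv = ["re", "create", "comments", "--source", source]
    if dataset is not None:
        # Only needed when the JSONL file carries labels.
        argv += ["--dataset", dataset]
    argv += ["--file", file_name]
    if overwrite:
        # Only needed when updating comments that already exist.
        argv.append("--overwrite")
    return argv


# Hypothetical names for illustration.
argv = upload_command(
    "my-project/my-source",
    "comments.jsonl",
    dataset="my-project/my-dataset",
)
print(shlex.join(argv))  # shell-safe rendering of the command
```

Assuming the `re` executable is on the PATH, the list can be passed to `subprocess.run(argv, check=True)` to perform the upload.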