[INLONG-895][Doc] Improve HTTP report documentation (#896)
gosonzhang authored Nov 30, 2023
1 parent 248cdd0 commit c6079e5
Showing 4 changed files with 36 additions and 0 deletions.
16 changes: 16 additions & 0 deletions docs/sdk/dataproxy-sdk/http.md
@@ -3,9 +3,25 @@ title: HTTP Report
sidebar_position: 3
---

## Introduction to the HTTP Reporting Process
InLong processes HTTP report messages through DataProxy nodes. The reporting source periodically obtains the access point list from the Manager, selects an available HTTP reporting node from that list according to its own strategy, and then produces data over the HTTP protocol. The overall HTTP reporting process is illustrated in the following diagram:

![](img/http_report.png)

- Heartbeat reporting: DataProxy periodically reports heartbeats to the Manager, carrying the {IP, Port, Protocol, Load} information of the access points enabled on the node.
- Online node caching: The Manager caches the heartbeat information reported by DataProxy and uses it to track the available access nodes in the cluster and their reporting access information.
- Access point acquisition: The HTTP SDK (either the HttpProxySender implemented by DataProxy-SDK or an HTTP reporting SDK developed independently according to the HTTP reporting protocol) periodically obtains the list of available reporting access points for the current groupId by calling the Manager's "/inlong/manager/openapi/dataproxy/getIpList/{inlongGroupId}" method.
- Access point selection: The HTTP SDK selects the DataProxy node to report messages to, based on its reporting node selection strategy.
- Data reporting: The HTTP SDK constructs the reporting message according to the HTTP reporting protocol and sends the request to the selected DataProxy node. After receiving the response, it decides based on the result whether to resend the message, output an exception, and so on.
- Data acceptance: DataProxy checks the HTTP message. If the message is accepted, DataProxy returns a success response and forwards the message to the MQ cluster. If the message format or values do not meet the specification, or if message processing fails, DataProxy returns a failure response carrying the corresponding error code and detailed error information.
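
The acquisition and selection steps above can be sketched as follows. This is a minimal illustration, not the DataProxy-SDK implementation: the text does not describe the selection strategy or the schema of the getIpList response, so the dictionary keys (`ip`, `port`, `protocol`, `load`), the "lowest load wins" rule, and the Manager base URL are all assumptions that mirror the {IP, Port, Protocol, Load} heartbeat fields.

```python
import random
from urllib.parse import quote


def build_get_ip_list_url(manager_base_url, inlong_group_id):
    """Build the Manager URL for the getIpList method named in the text.

    `manager_base_url` is a hypothetical value such as "http://manager-host:port".
    """
    base = manager_base_url.rstrip("/")
    group = quote(inlong_group_id, safe="")
    return f"{base}/inlong/manager/openapi/dataproxy/getIpList/{group}"


def select_http_node(access_points, rng=random):
    """Pick one HTTP access point, preferring the lowest reported load.

    Assumed (hypothetical) record shape:
        {"ip": str, "port": int, "protocol": str, "load": int}
    """
    http_nodes = [
        ap for ap in access_points
        if str(ap.get("protocol", "")).upper() == "HTTP"
    ]
    if not http_nodes:
        raise LookupError("no HTTP access point available")
    lowest = min(ap["load"] for ap in http_nodes)
    # Break ties randomly so equally loaded nodes share the traffic.
    candidates = [ap for ap in http_nodes if ap["load"] == lowest]
    return rng.choice(candidates)
```

A real SDK would refresh the list periodically and might weigh other signals; the point here is only that non-HTTP access points are filtered out before a node is chosen.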

Suggestion:
Because HTTP reporting suffers from low performance, a low proportion of valid data, and request messages that are easily lost, businesses are advised to prefer the TCP method for data reporting.

## Create real-time synchronization task
Create a task on the Dashboard or through the command line, and use `Auto Push` (autonomous push) as the data source type.


## Method 1: Call the interface to report (CURL)
```bash
curl -X POST -d 'groupId=give_your_group_id&streamId=give_your_stream_id&dt=data_time&body=give_your_data_body&cnt=1' http://dataproxy_url:46802/dataproxy/message
```
Binary file added docs/sdk/dataproxy-sdk/img/http_report.png
@@ -3,6 +3,26 @@ title: HTTP Report
sidebar_position: 3
---

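## Introduction to the HTTP Reporting Process
InLong processes HTTP report messages through DataProxy nodes. The reporting source periodically obtains the access point list from the Manager, selects an available HTTP reporting node from that list according to its own strategy, and then produces data over the HTTP protocol. The overall HTTP reporting process is illustrated in the following diagram: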

![](img/http_report.png)

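- Heartbeat reporting: DataProxy periodically reports heartbeats to the Manager, carrying the {IP, Port, Protocol, Load} information of the access points enabled on the node;

- Online node caching: The Manager caches the heartbeat information reported by DataProxy and uses it to track the available access nodes in the cluster and their reporting access information;

- Access point acquisition: The HTTP SDK (either the HttpProxySender implemented by DataProxy-SDK or an HTTP reporting SDK developed independently according to the HTTP reporting protocol) periodically obtains the list of available reporting access points for the current groupId by calling the Manager's "/inlong/manager/openapi/dataproxy/getIpList/{inlongGroupId}" method;

- Access point selection: The HTTP SDK selects the DataProxy node to report messages to, based on its reporting node selection strategy;

- Data reporting: The HTTP SDK constructs the reporting message according to the HTTP reporting protocol and sends the request to the selected DataProxy node. After receiving the response, it decides based on the result whether to resend the message, output an exception, and so on;

- Data acceptance: DataProxy checks the HTTP message. If the message is accepted, DataProxy returns a success response and forwards the message to the MQ cluster. If the message format or values do not meet the specification, or if message processing fails, DataProxy returns a failure response carrying the corresponding error code and detailed error information.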
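
The data reporting step, including the resend decision, can be sketched as below. The form fields (`groupId`, `streamId`, `dt`, `body`, `cnt`) and the `/dataproxy/message` path are taken from the curl example elsewhere in this commit; everything else is an assumption made for illustration. In particular, `dt` is assumed to be a millisecond timestamp, the retry policy (rotate through nodes, three attempts) is hypothetical, and the `send` callable is injected because the DataProxy response format is not shown here.

```python
import time
import urllib.parse
import urllib.request


def build_report_request(host, port, group_id, stream_id, body, dt_ms=None):
    """Build (but do not send) a POST request shaped like the curl example."""
    if dt_ms is None:
        dt_ms = int(time.time() * 1000)  # assumption: dt is epoch milliseconds
    form = {
        "groupId": group_id,
        "streamId": stream_id,
        "dt": str(dt_ms),
        "body": body,
        "cnt": "1",  # one message per request, as in the curl example
    }
    data = urllib.parse.urlencode(form).encode("utf-8")
    url = f"http://{host}:{port}/dataproxy/message"
    return urllib.request.Request(url, data=data, method="POST")


def report_with_retry(nodes, group_id, stream_id, body, send,
                      max_attempts=3, dt_ms=None):
    """Try nodes in round-robin order until `send` reports success.

    `nodes` is a list of (host, port) tuples. `send` is a caller-supplied
    callable that takes a Request and returns True when DataProxy accepted it.
    Returns the node that accepted the message, or None if all attempts failed.
    """
    if not nodes:
        return None
    for attempt in range(max_attempts):
        host, port = nodes[attempt % len(nodes)]
        request = build_report_request(host, port, group_id, stream_id,
                                       body, dt_ms)
        try:
            if send(request):
                return (host, port)
        except OSError:
            # Network-level failure (urllib's URLError is an OSError):
            # fall through and try the next node.
            continue
    return None
```

A concrete `send` could wrap `urllib.request.urlopen` and inspect the response; how to map the returned error code to "resend" versus "report an exception" depends on the HTTP reporting protocol and is deliberately left to the caller.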

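Suggestion:
Because HTTP reporting suffers from low performance, a low proportion of valid data, and request messages that are easily lost, businesses are advised to prefer the TCP method for data reporting.

## Create real-time synchronization task
Create a task on the Dashboard or through the command-line tool, and use `Auto Push` (autonomous push) as the data source type.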
