[INLONG-895][Doc] Improve HTTP report documentation #896

Merged
merged 2 commits into from
Nov 30, 2023
19 changes: 19 additions & 0 deletions docs/sdk/dataproxy-sdk/http.md
@@ -3,9 +3,28 @@ title: HTTP Report
sidebar_position: 3
---

## HTTP report process introduction
InLong handles HTTP reporting through DataProxy nodes. The data source periodically obtains the access point list from the Manager, selects an available HTTP DataProxy node from that list according to its own policy, and then sends data over HTTP. Because HTTP reporting suffers from low performance, a low proportion of valid data, and easy loss of request messages, users are advised to report data over TCP wherever possible. The overall HTTP report process is shown below:
Contributor

due to the performance of HTTP report due to problems such as low proportion of valid data and easy loss of request messages

Suggestion :
due to the issues of low performance, low percentage of effective data, and easy loss of request messages in HTTP reporting.

Contributor Author

fixed


![](img/http_report.png)

- Heartbeat report: DataProxy periodically reports heartbeats to the Manager, providing the {IP, Port, Protocol, Load} information for the access points the node has enabled;

- Online node cache: The Manager caches the heartbeat information reported by DataProxy to track the available access nodes in the cluster and their available report access information;

- Access point acquisition: The HTTP SDK (the data source uses either the HttpProxySender implemented by DataProxy-SDK, or an HTTP report SDK it has developed itself according to the HTTP report protocol) periodically calls the "/inlong/manager/openapi/dataproxy/getIpList/{inlongGroupId}" method to obtain from the Manager the list of available report access points for the groupId currently being reported;

- Access point selection: The HTTP SDK selects the DataProxy node to report to according to its report node selection strategy;

- Data report: The HTTP SDK constructs a message according to the HTTP report protocol and sends it as a request to the selected DataProxy node. After receiving the response, it decides based on the result whether to resend the message, output an exception, and so on;

- Data acceptance: DataProxy checks the HTTP message. If the message is accepted, DataProxy returns a success response and forwards the message to the MQ cluster. If the message format or values do not meet the specification, or message processing fails, DataProxy returns a failure response carrying the corresponding error code and a detailed error message.
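
The client-side steps above (access point selection, data report, and the resend decision) can be sketched roughly as follows. This is an illustration only, not the DataProxy-SDK implementation: the `AccessPoint` fields mirror the {IP, Port, Protocol, Load} heartbeat tuple, while the function names, the `"HTTP"` protocol value, the lowest-load-first policy, and the boolean `send` callback are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class AccessPoint:
    """One entry of the access point list (fields mirror the heartbeat tuple)."""
    ip: str
    port: int
    protocol: str
    load: int


def select_http_nodes(nodes: Iterable[AccessPoint]) -> List[AccessPoint]:
    """Keep only HTTP nodes, ordered from least to most loaded (assumed policy)."""
    http_nodes = [n for n in nodes if n.protocol.upper() == "HTTP"]
    return sorted(http_nodes, key=lambda n: n.load)


def report_with_failover(
    nodes: Iterable[AccessPoint],
    send: Callable[[AccessPoint], bool],
    max_attempts: int = 3,
) -> Optional[AccessPoint]:
    """Try candidate nodes in order until `send` reports success.

    `send` is a hypothetical callback that posts the message to one node and
    returns True when DataProxy answers with a success response.
    Returns the node that accepted the message, or None if every attempt failed.
    """
    candidates = select_http_nodes(nodes)
    for node in candidates[:max_attempts]:
        if send(node):
            return node
    return None
```

A real SDK would also refresh the node list from the Manager on a timer and inspect the error code in a failure response before deciding whether a resend is worthwhile; both are left out here because the response format is not described in this section.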


## Create real-time synchronization task
Create a task on the Dashboard or through the command line, and use `Auto Push` (autonomous push) as the data source type.


## Method 1: Call the interface to report (CURL)
```bash
curl -X POST -d 'groupId=give_your_group_id&streamId=give_your_stream_id&dt=data_time&body=give_your_data_body&cnt=1' http://dataproxy_url:46802/dataproxy/message
```

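For callers that prefer not to shell out to curl, the same request can be assembled with the Python standard library. This is a minimal sketch: the field names, port `46802`, and path `/dataproxy/message` are taken from the curl example, while the function name `build_report_request` is hypothetical and the explicit form-encoded `Content-Type` header is an assumption based on how `curl -d` behaves. The expected format of `dt` is not specified here, so it is passed through as given.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_report_request(host, group_id, stream_id, dt, body, cnt=1, port=46802):
    """Build (but do not send) the POST request made by the curl example."""
    fields = {
        "groupId": group_id,
        "streamId": stream_id,
        "dt": dt,
        "body": body,
        "cnt": cnt,
    }
    return Request(
        url=f"http://{host}:{port}/dataproxy/message",
        # urlencode percent-encodes values, which a raw `curl -d` string does not.
        data=urlencode(fields).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(req, timeout=5)`. Building it separately from sending keeps the encoding step easy to inspect and test.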
Binary file added docs/sdk/dataproxy-sdk/img/http_report.png
@@ -3,6 +3,24 @@ title: HTTP 上报
sidebar_position: 3
---

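## HTTP report process introduction
InLong handles HTTP report messages through DataProxy nodes. The report source periodically obtains the access point list from the Manager, selects an available HTTP report node from that list according to its own policy, and then sends data over HTTP. Because HTTP reporting suffers from low performance, a low proportion of valid data, and easy loss of request messages, users are advised to report data over TCP wherever possible. The overall HTTP report process is shown in the figure below:

![](img/http_report.png)

- Heartbeat report: DataProxy periodically reports heartbeats to the Manager, providing the {IP, Port, Protocol, Load} information for the access points the node has enabled;

- Online node cache: The Manager caches the heartbeat information reported by DataProxy to track the available access nodes in the cluster and their available report access information;

- Access point acquisition: The HTTP SDK (the data report source uses either the HttpProxySender implemented by DataProxy-SDK, or an HTTP report SDK it has developed itself according to the HTTP report protocol) periodically calls the "/inlong/manager/openapi/dataproxy/getIpList/{inlongGroupId}" method to obtain from the Manager the list of available report access points for the groupId currently being reported;

- Access point selection: The HTTP SDK selects the DataProxy node to report messages to according to its report node selection strategy;

- Data report: The HTTP SDK constructs a report message according to the HTTP report protocol and sends it as a request to the selected DataProxy node. After receiving the response, it decides based on the result whether to resend the message, output an exception, and so on;

- Data acceptance: DataProxy checks the HTTP message. If the message is accepted, DataProxy returns a success response and forwards the message to the MQ cluster. If the message format or values do not meet the specification, or message processing fails, DataProxy returns a failure response carrying the corresponding error code and a detailed error message.


## Create real-time synchronization task
Create a task on the Dashboard or with the command-line tool, and use `Auto Push` (autonomous push) as the data source type.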
