Update Sourcegraph Private Connect doc page #755

Open
wants to merge 4 commits into
base: main
104 changes: 52 additions & 52 deletions docs/cloud/private_connectivity_sourcegraph_connect.mdx
@@ -1,115 +1,115 @@
# Private Resources on on-prem data center via Sourcegraph Connect agent
# Private Resources in On-Prem Data Centers via Sourcegraph Connect Agent

<Callout type="note">This feature is in the Experimental stage. Please contact Sourcegraph directly via [preferred contact method](https://about.sourcegraph.com/contact) for more information.</Callout>
<Callout type="note">This feature is in the Experimental stage. [Contact us](https://about.sourcegraph.com/contact) for more information.</Callout>

As part of the [Enterprise tier](https://sourcegraph.com/pricing), Sourcegraph Cloud supports connecting private resources on any on-prem private network by running Sourcegraph Connect tunnel agent in customer infrastructure.
As part of the [Enterprise tier](https://sourcegraph.com/pricing), Sourcegraph Cloud supports connecting to resources in the customer's private network by deploying the Sourcegraph Connect tunnel agent in the customer's network.

## How it works

Sourcegraph will set up a tunnel server in a customer dedicated GCP project. Customer will start the tunnel agent provided by Sourcegraph with the provided credential. After start, the agent will authenticate and establish a secure connection with Sourcegraph tunnel server.
Sourcegraph Connect consists of three components:

Sourcegraph Connect consists of three major components:
Tunnel server: a centralized broker between clients and agents, managed by Sourcegraph. It authenticates agents and clients, enforces ACLs, sets up mTLS, and proxies encrypted traffic between clients and agents. Sourcegraph deploys each customer's tunnel server into its own dedicated GCP project.

Tunnel agent: deployed inside the customer network, which uses its own identity and encrypts traffic between the customer code host and client. Agent can only communicate with permitted customer code hosts inside the customer network. Only agents are allowed to establish secure connections with tunnel server, the server can only accept connections if agent identity is approved.
Tunnel agent: deployed inside the customer's network. It has its own identity, and proxies and encrypts traffic between the code host and the tunnel client. The agent can only communicate with permitted private resources inside the customer's network. The customer starts the tunnel agent with the credentials provided by Sourcegraph; the agent then authenticates and establishes a secure connection with the tunnel server. Only agents are allowed to establish secure connections with the tunnel server, and the server only accepts a connection if the agent's identity is approved.

Tunnel server: a centralized broker between client and agent managed by Sourcegraph. Its purpose is to set up mTLS, proxy encrypted traffic between clients and agents and enforce ACL.

Tunnel client: forward proxy clients managed by sourcegraph. Every client has its own identity and it cannot establish a direct connection with the customer agent, and has to go through tunnel server.

[link](https://link.excalidraw.com/readonly/453uvY8infI8wskSecGJ)
Tunnel client: forward proxy clients managed by Sourcegraph, added to the customer's Sourcegraph Cloud instance. Every client has its own identity. A client cannot establish a direct connection with the customer's agent; it has to go through the tunnel server.

<iframe
src="https://link.excalidraw.com/readonly/453uvY8infI8wskSecGJ"
width="100%"
height="100%"
style={{ border: "none" }}
/>
[Diagram link](https://link.excalidraw.com/readonly/453uvY8infI8wskSecGJ)

## Steps

### Initiate the process

Customer should reach out to their account manager to initiate the process. The account manager will work with the customer to collect the required information and initiate the process, including but not limited to:
The customer reaches out to their account manager to initiate the process. The account manager will work with the customer to collect the required information, including but not limited to:

- The DNS name of the private code host, e.g. `gitlab.internal.company.net` or private artifact registry, e.g. `artifactory.internal.company.net`.
- The port of the private code host, e.g., `443`, `80`, `22`.
- The type of TLS certificate used by the private resource: either self-signed by an internal private CA or issued by a public CA.
- The DNS name of the private code host, e.g. `gitlab.internal.company.net` or private artifact registry, e.g. `artifactory.internal.company.net`
- The port of the private code host, e.g., `443`, `80`, `22`
- The type of TLS certificate used by the private resource: self-signed, internal PKI, or issued by a public CA

Finally, Sourcegraph will provide the following:

- Instruction to run the agent along with credentials, and endpoint to allowlist egress traffic if needed.
Sourcegraph will provide the instructions and credentials to run the agent, and public IPs and ports of the tunnel server to allowlist egress traffic.

### Create the connection

Customer can follow the provided instructions and install the tunnel agent in the private network. At a high level:
The customer follows the instructions, and installs the agent in their private network. At a high level:

- Permit egress to the internet to a set of static IP addresses and corresponding ports to be provided by Sourcegraph.
- Permit egress to the private resources at the given port.
- Run the tunnel agent binary or docker images with the provided config files and credentials.
- Permit internet egress to the provided endpoint static IPs and ports
- Permit egress to the private resources
- Run the tunnel agent (via Docker container, or binary) with the provided config file and credentials

### Create the code host connection
### Configure the code host connection

Once the connection to private code host is established, the customer can create the [code host connection](/admin/code_hosts/) on their Sourcegraph Cloud instance.
Once the tunnel is established between the agent and server, the customer can configure the [code host connection](/admin/code_hosts/) on their Sourcegraph Cloud instance.

## FAQ

### Why TCP over gRPC?

The tunnel between the client and agent is built using TCP over gRPC. gRPC is a high-performance, battle-tested framework with built-in support for mTLS, which provides a trusted, secure connection. TCP and HTTP/2 are widely supported in the majority of customer environments. Moreover, having a single endpoint for the connection between the customer environment and their Cloud instance greatly simplifies the work required of the customer's IT admins. Compared to traditional VPN solutions, such as OpenVPN, IPSec, and SSH over bastion hosts, gRPC allows us to design our own protocol, and its programmable interface allows us to implement advanced features, e.g., fine-grained access control at a per-connection level and audit logging with rich metadata.

### How are connections encrypted? Can anyone else inspect the traffic?

Connections between the tunnel agent inside customer network and a tunnel server inside customer dedicated Sourcegraph GCP VPC use mTLS. Both agents, server and Sourcegraph clients have their own certificates and encrypt/decrypt traffic over TCP. mTLS enforce that both the client and the agent has to have a private key and present valid signed certificate from a trusted CA, which is not shared and this protects from [on-paths and spoofing attacks](https://www.cloudflare.com/en-gb/learning/access-management/what-is-mutual-tls/).
Connections between the tunnel agent inside the customer's network and the tunnel server inside the customer-dedicated Sourcegraph GCP VPC use mTLS. Agents, server, and clients all have their own certificates and encrypt and decrypt traffic over TCP. mTLS requires clients and agents to hold a private key, which is never shared, and to present a valid certificate signed by a trusted CA; this protects against [on-path and spoofing attacks](https://www.cloudflare.com/en-gb/learning/access-management/what-is-mutual-tls/).

### How do you authenticate requests?

Both tunnel clients and agents are assigned an identity corresponding to a GCP Service Account, and they are provided credentials to prove such identity. For tunnel agents, a Service Account key is distributed to the customer. For tunnel clients, it will utilize Workload Identity to prove its identity. They use them to authenticate against tunnel server by sending signed JWT tokens and public key. JWT token contains information about GCP service account credential public key required to validate signature and confirm identity of requestor. The server will then sign the requestor public key and respond with a signed certificate containing GCP Service Account email as a Subject Alternative Name (SAN).
Both tunnel clients and agents are assigned an identity corresponding to a GCP Service Account, and they are provided credentials to prove this identity. Agents use the Service Account key provided to the customer at install time, and clients use Workload Identity to prove their identity. They use these credentials to authenticate against the tunnel server by sending signed JWTs and public keys. Each JWT identifies the GCP Service Account public key required to validate its signature and confirm the identity of the requestor. The server then signs the requestor's public key and responds with a signed certificate containing the GCP Service Account email as a Subject Alternative Name (SAN).

Finally, if the customer NAT Gateway/Exit Gateway has stable CIDRs, we can provision firewall rules to restrict access to the tunnel server from the provided IP ranges only for an added layer of security.
For an added layer of security, if the customer network's NAT / internet gateway uses public IPs in a stable CIDR range, we can provision firewall rules to restrict access to the tunnel server from the provided IP ranges.

### How do you enforce authorization to restrict what requests can reach the private code host?
### How do you enforce authorization to restrict which requests can reach private resources?

The tunnel server is configured with ACLs. With mTLS every entity in the network has its own identity. The client's identity is used as a source for accessing customer private code hosts, while the agent's identity is used for destination. Tunnel server ensures that only clients with proven identity can communicate with customer tunnel agents.
With mTLS, every entity in the network has its own identity. The tunnel server is configured with ACLs, using the client's identity as the source and the agent's identity as the destination. The tunnel server ensures that only clients with a proven identity can communicate with the customer's tunnel agents.
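
The source/destination ACL model described above can be pictured with a minimal sketch. This is an illustration only, not the tunnel server's actual implementation: the `AclRule` type, the `is_allowed` helper, and the service account emails are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    """One hypothetical ACL entry: a client identity allowed to reach an agent identity."""
    source: str       # client identity, e.g. a GCP Service Account email
    destination: str  # agent identity, e.g. a GCP Service Account email


def is_allowed(rules, client_identity, agent_identity):
    """Return True only if an explicit rule permits this client -> agent pair."""
    return AclRule(client_identity, agent_identity) in set(rules)


# Hypothetical identities, for illustration only.
CLIENT = "tunnel-client@example-project.iam.gserviceaccount.com"
AGENT = "tunnel-agent@example-project.iam.gserviceaccount.com"
RULES = [AclRule(source=CLIENT, destination=AGENT)]
```

In this sketch, anything not explicitly listed is denied, which matches the intent that only clients with a proven, permitted identity reach an agent.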

### Do you rotate the encryption keys?
### How do you manage keys and certificates?

Encryption keys are short-lived and both tunnel agents and clients have to refresh certificates every 24h. The customer may also manually rotate it by restarting the tunnel agent.
We utilize GCP Certificate Authority Service (CAS), a managed Public Key Infrastructure (PKI) service. It is responsible for the storage of root and intermediate CA signing keys, and the signing of client certificates. Access to GCP CAS is governed by GCP IAM, and only necessary individuals and services can access CAS, with audit trails in GCP Logging.

### How do you manage keys or certificates?
The TLS private keys in the agents and clients only exist in memory, and are never transmitted or shared. Only the public key is sent to the tunnel server, which issues a signed certificate used to establish the mTLS connection.

We utilize GCP Certificate Authority Service (CAS), a managed Public Key Infrastructure (PKI) service. It is responsible for the storage of all signing keys (e.g., root CAs, immediate CAs), and the signing of client certificates. Access to GCP CAS is governed by GCP IAM service and only necessary services or individuals will have access to the service with audit trails in GCP Logging.
### How often do you rotate the encryption keys?

The TLS private key on the tunnel agent or tunnel clients only exist in memory, and are never shared with other parties. Only the public key is sent to the tunnel server to issue a signed certificate to establish mTLS connection.
Encryption keys are short-lived, and both tunnel agents and clients refresh their certificates every 24 hours. The customer may also manually rotate the agent's certificate by restarting the agent.
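
As a rough illustration of the 24-hour refresh policy, the check below decides whether a certificate is due for renewal. It is a sketch under assumptions: the `needs_refresh` helper and the idea of tracking an `issued_at` timestamp are hypothetical, and the real agent may schedule refreshes differently.

```python
from datetime import datetime, timedelta, timezone

# Refresh interval taken from the documentation above.
REFRESH_INTERVAL = timedelta(hours=24)


def needs_refresh(issued_at, now, interval=REFRESH_INTERVAL):
    """Return True once at least `interval` has elapsed since the certificate was issued."""
    return now - issued_at >= interval


# Example: a certificate issued at midnight UTC is due again 24 hours later.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
```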

### How do you audit access?

Tunnel server will log all critical operations with sufficient metadata to identify the requester to GCP Logging with a default 30-day retention policy. We will also be monitoring unauthorized access events to watch out for potential attackers.
The tunnel server logs operations to GCP Logging, with sufficient metadata to identify the requester and a default 30-day retention policy. We also monitor unauthorized access events to watch for potential attacks.

### Why TCP over gRPC?
### What if an attacker gains access to the Sourcegraph Cloud instance?

The tunnel is built using TCP over gRPC. gRPC is a high-performant and battle-tested framework, e.g., built-in support for mTLS for a trusted secure connection. We believe TCP and HTTP/2 are widely supported in majority of environments. Moreover, the simplicity of having a single endpoint for connection between customer environment and their Cloud instance greatly simplifies the work required for customer IT admin. Compared to traditional VPN solutions, such as OpenVPN, IPSec, and SSH over bastion hosts, gRPC allows us to design our own protocol, and the programmable interface allows us to implement advanced features, e.g., fine-grained access control at a per connection level, audit logging with rich metadata.
If an attacker gains access to the Sourcegraph Cloud instance's containers, this would be a security breach and would trigger our Incident Response process. However, we have many controls in place to prevent this: Cloud infrastructure access always requires approval, and the Security team is on call to respond to unexpected usage patterns. You may learn more from our [Security Portal](https://security.sourcegraph.com/).

### How many agents can a customer start?
Please reach out to us if you have any specific questions regarding our Cloud security posture; we are happy to provide more detail to address your concerns.

To obtain high availability, customers can start multiple tunnel agents. Each of the agents will use the same GCP Service Account credentials, authenticate with the tunnel server and establish connection to it. Tunnel client will randomly select an available agent to forward the traffic.
### How do I need to configure my network for the agent to work?

### How does the customer configure the network to make the agent work?
The tunnel agent needs to connect to the tunnel server. Sourcegraph will provide a dedicated static IP from a customer-dedicated GCP VPC, which the agent uses to connect to the tunnel server. The customer must configure network egress to allow TCP (HTTP/2) traffic to this static IP.

The customer tunnel agent has to authenticate and establish connection with the tunnel server. Sourcegraph will provide a single dedicated static IP from customer dedicated GCP VPC which is used to connect with the tunnel server. Customer has to configure network egress to allow TCP (HTTP/2) traffic access to this static IP.
### How can I restrict access to my private resources?

### How can I restrict access to my private code host connection?
The customer has full control of their network where they deploy the tunnel agent, and can configure, monitor, and terminate the connection at will. We recommend implementing an allowlist to restrict the egress traffic of the agent to the IP addresses provided by Sourcegraph and to the specific private resources your Sourcegraph Cloud instance needs to access, and configuring your firewall to alert you if this ACL is hit. If your code hosts or registries use DNS names, the agent will need access to the DNS server configured on its host.
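
The recommended egress allowlist can be sketched as a simple check over (CIDR, port) pairs. This only illustrates the policy; in practice it would be enforced by your firewall, not by application code. The `egress_allowed` helper and all addresses below are hypothetical (the public IP is from a documentation-only range).

```python
import ipaddress


def egress_allowed(allowlist, dest_ip, dest_port):
    """Return True if dest_ip:dest_port matches any (CIDR, port) entry in the allowlist."""
    addr = ipaddress.ip_address(dest_ip)
    return any(
        addr in ipaddress.ip_network(cidr) and dest_port == port
        for cidr, port in allowlist
    )


# Hypothetical allowlist: the tunnel server's static IP plus one private code host.
ALLOWLIST = [
    ("203.0.113.10/32", 443),  # tunnel server endpoint provided by Sourcegraph (placeholder)
    ("10.0.5.20/32", 443),     # private code host (placeholder)
]
```

Any destination outside the list, or a listed host on an unlisted port, is rejected, which is the behavior you would want your firewall to alert on.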

The customer has full control over the tunnel agent configuration and they can terminate the connection at any time.
What if the attacker gains access to the frontend?
### How can I harden the tunnel agent deployment?

In the event of an attacker gaining access to the Sourcegraph containers, we consider this to be a security breach and we have Incident response processes in place that we will follow. However, we have many controls in place to prevent this from happening where Cloud infrastructure access always requires approval and the Security team is on-call for unexpected usage patterns. You may learn more from our [Security Portal](https://security.sourcegraph.com/).
You can:

Please reach out to us if you have any specific questions regarding our Cloud security posture, and we are happy to provide more detail to address your concerns.
- Deploy the agent on a hardened container platform
- Store the agent credential and config content in a secrets management system and mount these secrets to the container
- Forward the agent's logs to your log management system

### How to harden the tunnel agent deployment?
### How can I inspect the agent's traffic, and audit the data the agent is accessing in my environment?

We recommend using an allowlist to limit the egress traffic of the agent to IP addresses provided by Sourcegraph and specific private resources you would like to permit access. This will prevent the agent to talk to arbitrary services, and reduce the blast radius in the event of a security event.
If a customer needs to inspect and audit traffic, such as performing TLS inspection on the connection between the private resources and Sourcegraph Cloud, we recommend inspecting traffic between the tunnel agent and internal resources, as this traffic uses the protocols and encryption of the internal resources. The tunnel from the agent to the server is encrypted and authenticated by mTLS over gRPC, and uses a custom protocol, so the decrypted payload isn't usable for traffic inspection.

### How can I audit the data Sourcegraph has access to in my environment?
### Can I use Internal PKI or self-signed TLS certificates for my private resources?

The tunnel is secured and authenticated by mTLS over gRPC, and everything is encrypted over transit. If a customer is looking to perform an audit, such as TLS inspection, on the connection between the private resources and Sourcegraph Cloud, we recommend only intercepting and inspecting traffic between the tunnel agent and private resources. The connection between the tunnel agent and Sourcegraph Cloud is using a custom protocol, and the decrypted payload has very little value.
Yes. Please work with your account team to add the public certificate chain of your internal CA under `experimentalFeatures.tls.external.certificates` in your instance's [site configuration](/admin/config/site_config#experimentalFeatures).

### Can I use self-signed TLS certificate for my private resources?
### Is this connection highly available?

Yes. Please work with your account team to add the certificate chain of your internal CA to [site configuration](/admin/config/site_config#experimentalFeatures) at `experimentalFeatures.tls.external.certificates`
To obtain high availability, customers can run multiple tunnel agents across their network. Each agent uses the same GCP Service Account credentials, authenticates with the tunnel server, and establishes its own connection to it. The tunnel client will randomly select an available agent to forward traffic through.
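
The random selection among available agents might look like the sketch below. It is illustrative only: the `pick_agent` helper, the agent names, and the connected/disconnected map are assumptions, not the tunnel client's actual logic.

```python
import random


def pick_agent(agents, rng=random):
    """Pick one connected agent uniformly at random.

    `agents` maps a hypothetical agent name to whether it is currently connected.
    """
    available = [name for name, connected in agents.items() if connected]
    if not available:
        raise RuntimeError("no tunnel agent is currently connected")
    return rng.choice(available)


# Hypothetical fleet: two agents are connected, one is down.
AGENTS = {"agent-a": True, "agent-b": False, "agent-c": True}
```

With more than one agent connected, losing any single agent leaves traffic flowing through the remaining ones.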