Skip to content

feat: specify a DERP region for the peer you are trying to connect to#1222

Merged
ramfox merged 17 commits into
mainfrom
ramfox/default_derp_region
Jul 17, 2023
Merged

feat: specify a DERP region for the peer you are trying to connect to#1222
ramfox merged 17 commits into
mainfrom
ramfox/default_derp_region

Conversation

@ramfox

@ramfox ramfox commented Jul 13, 2023

Copy link
Copy Markdown
Member

Description

closes #1221
closes #1073

Notes & open questions

what happens if the peer you want to talk to is in a region you do not understand?

Hi, I've been going back and forth a few times with this. Can I get some feedback.

Now that we have multiple derp regions, it's possible that the regions you have configured are NOT the region that the node you are trying to communicate is in.

So, how should we handle the case where we cannot connect to that specific derp region?

a) When we add a peer to our address book, the derp region is required (along with the PublicKey & ALPNs), and we check our DERP configuration. If we don't have that derp region in our configuration, we fail when we attempt to add that peer to our address book. In this case, we won't ever attempt to dial the peer if no DERP region is provided or the DERP region provided does not exist in our configuration, even if we have a valid UDP connection
b) We do not require a derp region when you add a peer to the address book and we don't check if any derp region provided exists in our configuration. If no UDP address nor DERP region exists when we go to connect, we log a warning, but fail silently. Whatever timeouts exist on QUIC will timeout when no connection can be made. If you don't have logging, you will not understand why this failed.
c) You are not required to provide a DERP region, but if you DO, we will fail if that derp region does not exist in our configuration. The implication here is that if you have provided a DERP region, you expect hole punching to occur, and it is an error if you cannot attempt to hole punch in the current conditions. If you do NOT provide a derp region, you are accepting that hole punching is not possible or you have some outside information that the UDP addresses are accessible.

This last one seems like the best option to me & it's how I am planning to continue with the PR, but would love some feedback.

region 0

Currently, region_id 1 is expected to be NA, and region_id 2 is expected to be EU. This is currently only documented here: https://www.notion.so/number-zero/Derp-Regions-a63c309839e64d9ab19c23778a7b9c37

We also have a bit of a testing problem. Since the region_ids should be standardized, it seems we should leave the id 0 as a testing id.

impl From<Url> for DerpMap

Now that it is we have the possibility of multiple regions, we can no longer assume the default region id for any given Url, so we can no longer implement From on DerpMap for Url

Change checklist

  • Self-review.
  • Documentation updates if relevant.
  • Tests if relevant.

@ramfox ramfox marked this pull request as draft July 13, 2023 03:32
@flub

flub commented Jul 13, 2023

Copy link
Copy Markdown
Contributor

I think I would choose b.

If we don't know the DERP region in our config we might still be able to connect directly to it, it may not even need holepunching at all.

If we do not have any direct address and can not reach the peer's home derp server [0] then we'd have to return either a quinn::ConnectionError::Timeout or quinn::ConnectionError::TransportError (ConnectionError if possible, NO_VIABLE_PATH might be suitable but currently a bit of a stretch since this isn't a real quic path (yet)) at connection time.

[0] Also, I think not knowing the derp server in our own config might be equivalent to the lightly broader "can not reach that derp server" which we also have to handle anyway.

@dignifiedquire

Copy link
Copy Markdown
Contributor

I think we should require at least a derp region or a UDP address, if neither is provided we don't have any chance of connecting and should fail fast.

@ramfox

ramfox commented Jul 13, 2023

Copy link
Copy Markdown
Member Author

I think we should require at least a derp region or a UDP address, if neither is provided we don't have any chance of connecting and should fail fast.

If we do not have any direct address and can not reach the peer's home derp server [0] then we'd have to return either a quinn::ConnectionError::Timeout or quinn::ConnectionError::TransportError (ConnectionError if possible, NO_VIABLE_PATH might be suitable but currently a bit of a stretch since this isn't a real quic path (yet)) at connection time.

Okay, so if we have no UDP addresses, and no derp server OR if that derp server is not one we have in configuration we error. (no UDP && ( no DERP || unknown DERP )) -> error

However, once we attempt to dial in the Conn, if we discover that our connection to the derp server fails us, we don't have a good way of reporting those errors, because all attempts to send packets happen through poll_send, which immediately sends the transmits off to the Actor, which more or less eats the errors. There is likely a good strategy to report these errors back up to poll_send (maybe some ugh mutex locked error field that we check and take each time we poll_send?), but I'd like a longer discussion about the implications of this. My main concern when reporting these connection errors, is that I'm not totally sure how it's supposed to work when we use one magicsock::Conn for multiple connections, or will QUIC take care of that for us?

I'm going to move forward on this PR with the following scenarios in mind:

  • If you do not have UDP addresses and no DERP region, we error
  • If you do not have UDP addresses and the DERP region is unknown to us, we error
  • If you have any UDP address and no DERP region or the DERP region is unknown to us, we log a warning, but we keep the current behavior that exists on main.

@ramfox ramfox force-pushed the ramfox/default_derp_region branch from 1c1cbfd to 8c6c051 Compare July 14, 2023 06:31
@ramfox

ramfox commented Jul 14, 2023

Copy link
Copy Markdown
Member Author

I can't believe I did this, but I added a lock 😢 😢 😢

Now that 0 is a valid derp region (we use it in testing), we need a new way to represent that we are not connected to any derp region. So now our derp region is represented by Option<u16> (and not Atomicu16). And now that we have the possibility of multiple derp regions, or no derp region at all, it seemed like this was a more logical way of describing what is going on.

If there is a strong feeling about this, PLEASE FEEL FREE TO CHANGE IT BACK.

EDIT:
I've removed the lock again. 0 is back to meaning no derp server.

@ramfox ramfox requested a review from dignifiedquire July 14, 2023 06:46
@ramfox ramfox changed the title first pass: my_derp and add_known_addrs feat: specify a DERP region for the peer you are trying to connect to Jul 14, 2023
@ramfox ramfox marked this pull request as ready for review July 14, 2023 06:53
Comment thread iroh-net/src/hp/magicsock/conn.rs Outdated
@ramfox

ramfox commented Jul 14, 2023

Copy link
Copy Markdown
Member Author

Okay, I need to put this down for the night, but the cli tests are WAY more flakey now. Maybe it has something to do with how we now output the derp region to the console?

Otherwise everything is implemented.

You can test your ping to the DERP regions using:
iroh doctor derp-regions

You can explicitly fetch via DERP only by passing in a derp region ID using the --derp-region flag and no UDP address in iroh get

And you can also see that providing a region is essential if you are trying to hole punch by explicitly putting in the "wrong" region when you attempt to iroh get

Comment thread iroh-net/docs/derp_regions.md Outdated
Comment thread iroh-net/src/bin/derper.rs Outdated
Comment thread iroh-net/src/defaults.rs Outdated
/// Hostname of the default Derp.
pub const DEFAULT_DERP_HOSTNAME: &str = "derp.iroh.network";
/// Hostname of the default NA Derp.
pub const DEFAULT_NA_DERP_HOSTNAME: &str = "derp.iroh.network";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Arqu can we get this to be sth like na1. or similar?

Comment thread iroh-net/src/defaults.rs Outdated
Comment thread iroh-net/src/hp/derp/map.rs Outdated
Comment thread iroh-net/src/magic_endpoint.rs Outdated
Comment thread iroh/src/commands/doctor.rs Outdated
@b5

b5 commented Jul 14, 2023

Copy link
Copy Markdown
Member

At @ramfox's request I've run the test suite on windows, and run provide from a windows machine to a mac. While running the test suite I received two "allow this to happen" requests from the windows firewall, which I said yes to. Test suite passed, transfer worked.

ramfox and others added 8 commits July 14, 2023 17:24
Add `my_derp` (so we can display our own home derp) and adjust `add_known_addrs` to include a `region_id` in the `MagicEndpoint`.

Adjusts the `region_id` to be `Option<u16>` in the `hp` code, allowing for `None` if we do not have a derp home region

Allows a region id of `0` to be valid
connects to each region in specified in the configuration and reports either the latency of that region or an error
@ramfox ramfox force-pushed the ramfox/default_derp_region branch from 965a70b to b960db3 Compare July 14, 2023 23:28
@ramfox ramfox force-pushed the ramfox/default_derp_region branch from b960db3 to 6ed4527 Compare July 14, 2023 23:43
@ramfox ramfox force-pushed the ramfox/default_derp_region branch 2 times, most recently from f972366 to 6ed4527 Compare July 15, 2023 01:11
@ramfox ramfox requested a review from dignifiedquire July 17, 2023 03:54
@ramfox ramfox enabled auto-merge (squash) July 17, 2023 03:56
@ramfox ramfox self-assigned this Jul 17, 2023
@ramfox ramfox added NAT Traversal c-iroh-relay Functionality of the iroh-relay server. labels Jul 17, 2023
Comment thread iroh-net/src/magic_endpoint.rs
Comment thread iroh-net/src/hp/derp/http/mesh_clients.rs

@dignifiedquire dignifiedquire left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good to me, except the dialing restrictions, we should figure those out to make sure things still work.

@ramfox ramfox merged commit 456f963 into main Jul 17, 2023
self.outstanding_tasks.hairpin = true;
if !self.hairpin_actor.has_started() {
self.hairpin_actor.start_check(*addr);
self.outstanding_tasks.hairpin = true;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess booleans were too simple to keep track of this state. 😞 I knew I wasn't quite happy with how this state was tracked.

Good call on pushing that state into the actor client. I like it so much I'd go even further and try and get rid of self.outstanding_tasks entirely in favour of tracking the state in the tasks themselves. But that's for another PR.


A peer may be connected to multiple regions, but it will advertise its home region as the one best used to hole-punch or relay packets through.

When attempting to connect to another peer, it is not required to pass in that peer’s DERP region, but if the other peer is behind a NAT, the only way to attempt to hole punch directly through to it is by passing down a DERP region. It is also possible that the DERP region that the remote Iroh peer is connected to is one that you have no dialing information about. However, that is unlikely to happen if you use the default regions and nodes we have specified here. These also exist as defaults in the code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is difficult to understand... but I have no suggestions on how to improve it 😅

I think what you're trying to say is that for any direct connection establishment knowing the peer's home derp region is not needed. But if you need holepunching then you need to be aware of the peer's home derp server?

Comment thread iroh-net/docs/derp_regions.md
stun_only: false,
stun_port: 3478,
ipv4: UseIpv4::Some(std::net::Ipv4Addr::new(35, 175, 99, 113)),
ipv6: UseIpv6::TryDns,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how we'll notice to update the docs when we change this config format.

(I only do so because I just broke things by making a change to this format)

addr: "[::]:443".parse().unwrap(),
stun_port: DEFAULT_DERP_STUN_PORT,
hostname: DEFAULT_DERP_HOSTNAME.into(),
hostname: NA_DERP_HOSTNAME.into(),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, i read this as "Not Available derp hostname" initially...

/// Probe indicating the presence of port mapping protocols on the LAN.
pub portmap_probe: Option<portmapper::ProbeOutput>,
/// or 0 for unknown
/// `0` for unknown

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also would love to turn this into an option

Comment thread iroh-net/src/hp/netcheck/reportgen/probes.rs
Comment thread iroh-net/src/hp/netcheck/reportgen/probes.rs
Comment thread iroh-net/src/hp/netcheck/reportgen/probes.rs
Comment thread iroh-net/src/magic_endpoint.rs
@ramfox ramfox deleted the ramfox/default_derp_region branch July 18, 2023 15:45
matheus23 pushed a commit that referenced this pull request Nov 14, 2024
…#1222)

* pass down a `derp_region` when we connect to a peer: add `my_derp` (so we can display our own home derp) and adjust `add_known_addrs` to include a `region_id` in the `MagicEndpoint`.

* add `iroh doctor derp-regions` command: connects to each region in specified in the configuration and reports either the latency of that region or an error

* update cli commands to include a `derp_region` where necessary

* add derp region docs

* pr review fixes

* fix: stable sorting for regions

* fix hairpin bug causing probe to never finish

* comment out multiple `test_mesh_network` runs until further investigation on flakeyness

* upgrade yanked dependency


---------

Co-authored-by: dignifiedquire <me@dignifiedquire.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c-iroh-relay Functionality of the iroh-relay server. NAT Traversal

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

feat: integrate the notion of multiple derp regions into the code figure out derp home servers and region ids

5 participants