Ipv6 addresses that end in :: get truncated to only one : #796

Open
samJ-bitsight opened this issue Aug 30, 2024 · 15 comments

Comments

@samJ-bitsight

If an address I input ends with :: (for example, 1:2::), masscan outputs the IP as 1:2:, which is invalid.

@samJ-bitsight
Author

We are scraping IPv6 addresses from various inputs, and all of those sources seem to produce addresses ending in ::
I don't necessarily know if 1:2:: is valid, but we see more complicated addresses ending in :: as well; this is just an example.

I mean, the classic IPv6 loopback address is ::1, which uses the same :: shorthand, so an address like 1:: should be just as valid.

@mzpqnxow
Contributor

mzpqnxow commented Sep 7, 2024

Are CIDR blocks of that form handled correctly? For example, 2001:10::/28

By the way, I'm not opposed to fixing the behavior if it's incorrect. I would actually like to help with that, but I'm not completely certain that that style of address is valid (I'm not certain it's invalid either, I was hoping someone that works with ipv6 regularly enough would come along and tell one of us why we're wrong...)

Another question: what do applications like ping/ping6, nmap, socat, etc. make of the example address (and similar ones)?

I would be surprised if the implementation in masscan wasn't taken from a sound implementation. But I do find myself often surprised, I guess

@mzpqnxow
Contributor

mzpqnxow commented Sep 7, 2024

Can you give an example of the exact command you're using? So I can reproduce it and look into it?

@samJ-bitsight
Author

Hey! So I am using master. The problem is in the output once a valid address is found: if the found address ends with ::, it outputs

open tcp port 1: 1725889082

where instead we would want it to say

open tcp port 1:: 1725889082

@samJ-bitsight
Author

We aren't using CIDRs to scan the IPv6 space because it's so large. We have a list of single addresses scraped from various places, such as DNS resolutions, and we minify them to save a little space. Our current workaround is that when we read the masscan output, we check whether the IP address ends in a single : and, if so, append the missing one.

@mzpqnxow
Contributor

Hey! so I am using master, the problem is in the output once a valid address is found. if the found address ended with :: it outputs

open tcp port 1: 1725889082

where instead we would want it to say

open tcp port 1:: 1725889082

Ohhhh... well, I feel very silly then. I assumed the issue was in parsing the target list

I think what you're seeing should be an easy fix, since it is probably a simple issue in one (or all) of the relatively trivial output modules. I'm happy to take a quick stab at a fix.

A few things that you can help with:

1. Do you see that output on the command line output only, or do you see it with the output formats?

Quickest way to check:

masscan -oB out.bin <your usual scan params>

^-- exclude whatever -o option you normally use, only use -oB to capture to the compact/lossless binary format

Then you can do:

for x in D J X G; do
  masscan --readscan out.bin -o$x out.fmt.$x
done

Check each to see if the issue manifests.

Separately- it sounds like you may be doing this as part of an automated workflow- which is also my use case. If you're interested in tracking the status of the scan as it runs based on the console output, I submitted #564 (--ndjson-status) some years ago, which makes all the console stdout/stderr output line-buffered NDJSON. It beats the heck out of having regex to parse the status, if that's what you're currently doing. Aside from "FYI, it may be helpful to you", I'm also curious if the issue manifests there as well. That's something you would have to manually eyeball in a real scan (or I suppose you could 2>scan.stderr and grep)

I would check these things myself, but as I mentioned I don't have any IPv6-capable kernels running (meaning I can't even assign a dummy address to an interface; I would have to reboot first). Given that, I can't think of a really quick way to fake a SYN|ACK response that doesn't involve writing code that receives and transmits via raw sockets.

Let me know what you see from those other output formats and then I'll have an easier time of finding the problematic code

Thanks

@samJ-bitsight
Author

Hey! Yes, sadly I am running this in an automated system where we do a form of regex matching on the output.
Luckily we can pretty much just listen for any line that doesn't start with # to get all valid data, but I'll look into --ndjson-status and see if that helps us more!

@mzpqnxow
Contributor

mzpqnxow commented Sep 15, 2024

static void
_append_ipv6(stream_t *out, const unsigned char *ipv6)
{
    static const char hex[] = "0123456789abcdef";
    size_t i;
    int is_ellision = 0;

    /* An IPv6 address is printed as a series of 2-byte hex words
     * separated by colons :, for a total of 16-bytes */
    for (i = 0; i < 16; i += 2) {
        unsigned n = ipv6[i] << 8 | ipv6[i + 1];

        /* Handle the ellision case. A series of words with a value
         * of 0 can be removed completely, replaced by an extra colon */
        if (n == 0 && !is_ellision) {
            is_ellision = 1;
            while (i < 13 && ipv6[i + 2] == 0 && ipv6[i + 3] == 0)
                i += 2;
            _append_char(out, ':');

            /* test for a zero run that reaches the last word, as in
             * "1::" or the all-zero address "::"; emit the second colon. */
            while (i == 14 && ipv6[i] == 0 && ipv6[i + 1] == 0){
                i=16;
                _append_char(out, ':');
            }
            continue;
        }

        /* Print the colon between numbers. Fence-post alert: only colons
         * between numbers are printed, not at the beginning or end of the
         * string */
        if (i)
            _append_char(out, ':');

        /* Print the digits. Leading zeroes are not printed */
        if (n >> 12)
            _append_char(out, hex[(n >> 12) & 0xF]);
        if (n >> 8)
            _append_char(out, hex[(n >> 8) & 0xF]);
        if (n >> 4)
            _append_char(out, hex[(n >> 4) & 0xF]);
        _append_char(out, hex[(n >> 0) & 0xF]);
    }
}

Now, let's see what other applications do - this isn't exactly an unusual operation (and there was reference up the call stack to borrowing code from another project)

@mzpqnxow
Contributor

mzpqnxow commented Sep 15, 2024

tcpdump

https://github.com/the-tcpdump-group/tcpdump/blob/72095a100d236aa35efedd68d9297b65523cc989/addrtostr.c#L97

It does seem that the implementation in tcpdump has a case explicitly for trailing zeros

https://github.com/the-tcpdump-group/tcpdump/blob/72095a100d236aa35efedd68d9297b65523cc989/addrtostr.c#L196

This should be relatively simple to fix

@mzpqnxow
Contributor

mzpqnxow commented Sep 19, 2024

@samJ-bitsight can you be more explicit with regard to how exactly you're invoking masscan, and when/where exactly you're seeing the problematic output?

  • With what flags (ideally, the exact command-line arguments)
  • Where are you seeing that output (e.g. the terminal, the output of -oG, -oD, ...)
  • Have you looked seriously into the possibility that something in your pipeline (between masscan and the output you're seeing) is the culprit?

Asking that third question because I added a full set of tests for the address formatting functions in #801 and included several of the addresses you provided as problematic, but they all passed. None of the inputs produced what you're seeing

Perhaps I'm missing some one-off case where the address is not formatted by that common function, which would explain why the test cases didn't bring it to light. If you help me to understand exactly how/when/where you're seeing that output, I can try to see if it's somehow going around it

mzpqnxow added a commit to mzpqnxow/masscan that referenced this issue Sep 19, 2024