[afnog] A heads up on a nasty IPv6 bug

Jan Zorz zorz at isoc.org
Mon Aug 15 10:09:50 UTC 2016


On 15/08/16 11:51, Mukom Akong T. wrote:
> Very succinct description. It still doesn't explain why the host behind the
> CPE gets two different default gateways as Andrew is reporting.

It probably doesn't; I believe that was a typo... Andrew is probably a
bit tired from monitoring a deployment and analyzing/debugging the
possible problems, so typos like this happen.

I'm not aware of any auto-configuration mechanism that would cause the
default route to point towards a global IPv6 address on an L3 interface.

Cheers, Jan

> Unless in reality, between provisioning the various LAN-side /64s (or
> some other yet to be identified event), the link local address of the
> LAN interface changes. 
> 
> A static prefix will fix the problem of reachability due to a no-longer
> valid prefix. The different default gateways issue will still be
> present. Right?
> 
> 
> 
> 
> _____________________________
> From: Jan Zorz <zorz at isoc.org>
> Sent: Monday, August 15, 2016 8:35 AM
> Subject: Re: [afnog] A heads up on a nasty IPv6 bug
> To: Andrew Alston <andrew.alston at liquidtelecom.com>, Mukom Akong T.
> <mukom.tamon at gmail.com>
> Cc: <afnog at afnog.org>
> 
> 
> Hey,
> 
> Yes, this dynamic way of assigning IPv6 PDs is causing much trouble to
> operators around the world until they decide to swallow the initial pain
> and change to static PD assignments... ;)
> 
> Please see my comments inline...
> 
> On 14/08/16 16:40, Andrew Alston wrote:
>> If you are automatically allocating usernames for PPPoE authentication
>> it's relatively simple to tie that username provisioning to static
>> assignments.
>>
>> As an example, if your username ends in a number on your auto
>> provisioning system it's relatively simple to use some basic maths and
>> hex conversion to produce a static subnet that’s tied to it.
> 
> Yes, usually some math formula is created to tie the PD to the username,
> then a script populates an additional field in your RADIUS database and
> after that the user always gets the same IPv6 PD "for life".
> 
> If you have multiple aggregation or termination points then some
> observation is needed beforehand so you can group users on the same
> termination point to get PDs from the same aggregated prefix, but this is
> trivial.
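
A minimal sketch of the kind of formula described above, in Python. The
aggregates (taken from the 2001:db8::/32 documentation range), the /56
delegation size, the username format and the names static_pd and
BRAS_AGGREGATES are all assumptions made for illustration; the thread
does not specify any of them.

```python
import ipaddress
import re


def static_pd(username, aggregate, pd_len=56):
    """Map the trailing number of a username to a fixed delegated prefix."""
    match = re.search(r"(\d+)$", username)
    if match is None:
        raise ValueError(f"username {username!r} does not end in a number")
    index = int(match.group(1))

    pool = ipaddress.IPv6Network(aggregate)
    slots = 2 ** (pd_len - pool.prefixlen)  # how many PDs fit in the aggregate
    if index >= slots:
        raise ValueError("aggregate is too small for this user index")

    # Place the user index in the bits between the aggregate and the PD length.
    base = int(pool.network_address) + (index << (128 - pd_len))
    return ipaddress.IPv6Network((base, pd_len))


# Hypothetical aggregates, one per termination point (assumption).
BRAS_AGGREGATES = {
    "bras1": "2001:db8:100::/40",
    "bras2": "2001:db8:200::/40",
}

print(static_pd("user00255", BRAS_AGGREGATES["bras1"]))  # 2001:db8:100:ff00::/56
print(static_pd("user1", BRAS_AGGREGATES["bras2"]))      # 2001:db8:200:100::/56
```

The result would then be written once into the user's RADIUS record, so
the same username maps to the same PD on every reconnect.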
> 
>> With regard to the subnet issue on the RA, I’ll respond to that later,
>> though perhaps Jan would also like to make some comments on this, since
>> his understanding of it is admittedly better than mine until I do more
>> testing and labbing.
> 
> Ok, let's dust off the old saying:
> 
> "In theory there is no difference between theory and practice. In
> practice there is."
> 
> For the sake of simplicity, let's say we have only 3 components:
> 
> +--------+    wan    +---------+    lan     +-----------+
> |  ISP   |-----------|   CPE   |------------|   host    |
> +--------+           +---------+            +-----------+
> 
> ISP can be any access equipment you are using, for example a BRAS or
> anything else.
> 
> CPE has, for simplicity, just a WAN uplink and a single LAN segment
> behind it for the home network.
> 
> host is any device that we use on our home network - it can be a computer,
> laptop, tablet, mobile phone, printer - anything that connects to our
> network and can autoconfigure IPv6.
> 
> Theory:
> - CPE connects to ISP and gets a Prefix Delegation (PD) from ISP.
> - ISP installs a route for that PD towards the CPE WAN interface.
> - CPE provisions a /64 out of the PD on its LAN interface and starts
> sending out RA (Router Advertisement) packets with prefix information
> to the LAN.
> - host connects to the LAN and sends out an RS (Router Solicitation)
> packet, which is answered by an RA packet containing prefix information.
> - host accepts the packet, generates IPv6 address(es), does the DAD
> process and, if all is good, sets up the IPv6 addresses and sets the
> default route to the source IPv6 address of the RA message - that is the
> link-local address of the CPE LAN interface.
> - now IPv6 traffic can start flowing.
> - ISP decides that the PD must change, or something is wrong with the WAN
> link and the PD assignment process restarts (for example the PPPoE client
> restarts).
> - in this event CPE gets a different PD from ISP and needs to delete the
> old IPv6 address, which is no longer in the assigned PD, from its LAN port.
> - ISP installs a route for the new PD towards the CPE WAN interface
> and removes the route for the old PD towards that CPE.
> - CPE adds a new IPv6 address from a new /64 out of the new PD to the LAN
> interface and sends an RA packet to the LAN carrying the old prefix
> information with lifetime 0.
> - all hosts that receive the RA packet with lifetime 0 must remove the old
> IPv6 address and stop using it.
> - now we have a CPE with the new IPv6 PD and all hosts on the LAN with just
> IPv6 addresses from the new PD, and the world is a beautiful, nice and
> safe place.
> 
> This was the theory, now let's see some practice and the real world - what
> can (and will) go wrong?
> 
> We have 2 failure modes here:
> - host never receives the RA packet with lifetime 0 and ends up with IPv6
> addresses from both the old and the new IPv6 PD
> - host receives the RA packet with lifetime 0, but doesn't care much because
> it's implemented wrong (and this is happening, given the wide variety of
> end user devices that are in use on this earth)
> 
> In both cases we end up with a device that has the option of using the
> wrong IPv6 address as the source address for the packets it sends.
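
The two failure modes can be illustrated with a toy model in Python. This
is a deliberately simplified sketch that follows the description above,
not how any real stack is implemented: real hosts track preferred and
valid lifetimes separately and apply further rules. The Host class, its
honours_zero_lifetime flag, the renumber helper and the documentation
prefixes are all made up for illustration.

```python
class Host:
    """Toy model of a LAN host tracking the prefixes it learned from RAs."""

    def __init__(self, honours_zero_lifetime=True):
        # False models a host that ignores "lifetime 0" (second failure mode).
        self.honours_zero_lifetime = honours_zero_lifetime
        self.prefixes = []  # kept in the order they were learned

    def receive_ra(self, prefix, lifetime):
        if lifetime == 0:
            # The CPE is withdrawing this prefix.
            if self.honours_zero_lifetime and prefix in self.prefixes:
                self.prefixes.remove(prefix)
        elif prefix not in self.prefixes:
            self.prefixes.append(prefix)


OLD = "2001:db8:1:1::/64"  # /64 from the old PD (assumed value)
NEW = "2001:db8:2:1::/64"  # /64 from the new PD (assumed value)


def renumber(host, ra_lost=False):
    """Replay the PD change; ra_lost models the first failure mode."""
    host.receive_ra(OLD, lifetime=3600)
    if not ra_lost:
        host.receive_ra(OLD, lifetime=0)
    host.receive_ra(NEW, lifetime=3600)
    return host.prefixes


print(renumber(Host()))                             # ['2001:db8:2:1::/64']
print(renumber(Host(), ra_lost=True))               # old and new prefix
print(renumber(Host(honours_zero_lifetime=False)))  # old and new prefix
```

Only the well-behaved host that actually sees the RA ends up with the new
prefix alone; in the other two runs the old prefix stays on the host.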
> 
> The source address selection mechanism is broken in this case and there is
> an ongoing discussion at the IETF on how to fix that, but it will be some
> time before any fix becomes a standard and even more time before it's
> implemented.
> 
> Currently hosts select the source IPv6 address for packets in quite a
> variety of ways - some randomly, some "the address that was allocated
> last", some "the first one as long as it's valid", and all other possible
> ways, depending on OS and vendor.
> 
> The problem is that after changing the PD, the ISP removes the route back
> to the CPE for the old PD, so replies to packets sent from an old address
> never arrive. Some of the ISPs that went down the "less optimal" path on
> the IPv6 road and started with dynamic PD assignments did a quick fix: put
> the old PD in "quarantine" and keep the old route back to the CPE for an
> additional 24 hours, so that all old IPv6 addresses in the LAN segment
> behind the CPE expire and vanish. This is a quick hack that gives you a
> quasi-functional access network, so you can dedicate your time to
> planning a process to make PD assignments properly static (also over BRAS
> reboots).
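
A rough sketch of why the quarantine hack helps, again in Python. The
IspRoutes class, its method names and the prefixes are hypothetical and
only model the behaviour described above; they do not correspond to any
BRAS configuration or API.

```python
import ipaddress

QUARANTINE_SECONDS = 24 * 3600  # the "additional 24 hours" mentioned above


class IspRoutes:
    """Toy ISP routing table: delegated prefix -> expiry (None = active)."""

    def __init__(self):
        self.routes = {}

    def delegate(self, new_pd, now, old_pd=None, quarantine=False):
        self.routes[ipaddress.IPv6Network(new_pd)] = None
        if old_pd is not None:
            old = ipaddress.IPv6Network(old_pd)
            if quarantine:
                # Keep routing the old PD to the CPE for a while.
                self.routes[old] = now + QUARANTINE_SECONDS
            else:
                self.routes.pop(old, None)

    def can_reply_to(self, source, now):
        """True if return traffic to this source address is still routed."""
        addr = ipaddress.IPv6Address(source)
        return any(
            addr in net and (expiry is None or now < expiry)
            for net, expiry in self.routes.items()
        )


stale_source = "2001:db8:1:1::10"  # host still using an old-PD address

plain = IspRoutes()
plain.delegate("2001:db8:1::/56", now=0)
plain.delegate("2001:db8:2::/56", now=100, old_pd="2001:db8:1::/56")
print(plain.can_reply_to(stale_source, now=200))  # False

hack = IspRoutes()
hack.delegate("2001:db8:1::/56", now=0)
hack.delegate("2001:db8:2::/56", now=100, old_pd="2001:db8:1::/56",
              quarantine=True)
print(hack.can_reply_to(stale_source, now=200))                       # True
print(hack.can_reply_to(stale_source, now=100 + QUARANTINE_SECONDS))  # False
```

With static PD assignments the old and new PD are the same prefix, so the
stale-source case never arises in the first place.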
> 
> So, this is a pain that many operators went through and the solution
> is quite obvious: go with static IPv6 PD assignments. Your help desk
> will appreciate you.
> 
> Again: "In theory there is no difference between theory and practice. In
> practice there is."
> 
> I hope I shed some light on the issue with the above explanation.
> 
> See you all in Mauritius for Afrinic-25 where we can discuss IPv6
> deployment challenges at length ;)
> 
> Cheers and thnx, Jan Zorz
> 
> -- 
> Jan Zorz
> Internet Society
> mailto:<zorz at isoc.org>
> http://www.internetsociety.org/deploy360/
> --
> "Engineering is always positive in results..." N. Tesla
> 
> 
> 
> 
> _______________________________________________
> afnog mailing list
> https://www.afnog.org/mailman/listinfo/afnog
> 


-- 
Jan Zorz
Internet Society
mailto:<zorz at isoc.org>
http://www.internetsociety.org/deploy360/
--
"Engineering is always positive in results..." N. Tesla


