Sunday, July 3, 2011

Bye WNR3500L -- Hello WNDR3700

Back in March of 2010 I posted comments about my experiences with the Netgear WNR3500L wireless router.  At the time, I had some very positive things to say, but this update is the result of some greater wisdom I have obtained through experience since then.  What I was trying to do is move my IPv6 tunnel and iptables/ip6tables firewall to the WNR3500L and use it as a router as well as a wireless access point and Ethernet switch (I have my own way of setting up a firewall thankyouverymuch, and don't really feel like adapting to what others force me to do).

The Real WNR3500L Story

Netgear markets the WNR3500L wireless router as an "Open Source Router" and implies that the hardware can be customized as desired (see this article at "my open router", www.myopenrouter.com).  This is not entirely true because neither Netgear nor Broadcom (the manufacturer of the chipset and reference design) has provided sufficient documentation or source code for the hardware.  In the case of the switching chip (BCM53115S), the only information available online is a marketing announcement, and nobody has been successful at obtaining technical information directly from Broadcom.  Having the proper configuration of this chip is essential to the correct operation of the router, particularly if it is to be used for anything beyond the software Netgear provides.

After literally hours of going through Google and any Broadcom/Netgear source code I could find (and stomach), I discovered several things:
  1. Broadcom effectively keeps most of the key device drivers for their wireless processor and switch chips closed-source, despite what is claimed by Netgear (it isn't an "open source router" if the source isn't available).
  2. The person who wrote these drivers thought they were being clever by implementing an NVRAM-based configuration "registry" and then integrating it with the device drivers (who in their right mind puts userland type functionality into device drivers?!).
  3. Everyone I have seen who tried to do what I did came to the same conclusion:  that you could only get so far without having to do some serious reverse-engineering of the Broadcom drivers.
  4. In addition to point #2, Broadcom also embedded the same registry access in the CFE bootloader for the WNR3500L, and if you screw around with that registry too much, you can effectively "brick" your router.  No, the CFE sources for the WNR3500L aren't available.
It looks like Broadcom hired some students to write the embedded code for this router, since it has all the hallmarks of an inexperienced programmer with clever ideas.  Don't get me wrong, I was one of those people at one time too, and maybe still do the same thing from time to time, but I know code that doesn't inter-operate well with other code and is implemented haphazardly when I see it, and this is it.

I can no longer recommend this router unless you're 100% happy with the original Netgear firmware.

The only reason that Netgear can still call this an "Open Source Router" is that they do release the Open Source code that was used in the software design of the router.  Looking at the code they release, it seems apparent to me that Netgear has attempted to keep their end of the bargain, but only provides binary stubs for many of the hardware drivers and utilities because they are proprietary to Broadcom and other entities.

What about DD-WRT?

Have you ever looked at the DD-WRT source code?  No, seriously, have you?  If you have any question about what I am about to say, do it and come back here and then read this...

While DD-WRT does boot and kind of work on the WNR3500L, it doesn't work well.  To those who have put the effort into this project, I feel your pain, and I appreciate the hard work you did, but the bottom line is that getting something to work and getting it to work well are two different things.  Every single time I tried to use DD-WRT on the WNR3500L the switching functions seemed to work poorly (it would drop packets randomly on my MythTV systems so video would randomly pixelate).  This didn't happen on the factory firmware from Netgear.

To make a long story short, the people at DD-WRT basically appeared to take the mantra, "If you can't beat 'em, then join 'em," and proceeded to work around the weird registry-based tangle of userland/kernel code by just accepting and using it too, since they had no other choice but to use many of the Broadcom binary-only device drivers.  However, the problem is that Broadcom's do-it-all userland router service ("acos") pokes some funky values into the switching chip before and after the driver is loaded and I don't believe that DD-WRT is doing that right.  What are they not doing right?  Hell if I know.  I couldn't follow the DD-WRT source code for my life.  I couldn't tell what parts were appropriate for the WNR3500L and what was for other routers...which were binary-only (proprietary) and which were things I could look at and change.  There were pieces of code clearly marked as Broadcom proprietary that I'm not really sure were current for the router.  It is a bloody mess.  No real offense to the DD-WRT folks, because what they've done overall is pretty impressive, but that source code organization requires some serious drugs to understand!  Since I'm staying away from those, that pretty much put an end to my idea of fixing DD-WRT on this platform.

By this point, I decided that the only thing the WNR3500L was good for was exactly how it was sold, with the original Netgear firmware.  If that's what you want, it's a nice router.  If it isn't, then you'll do what I do, and give it to someone else who has a need for a router like this.

Discovering OpenWrt

During my frustration with DD-WRT and trying to decipher the ungodly source code that Netgear provided, I started looking seriously again at OpenWrt.  OpenWrt on the surface is another open source firmware alternative for common wireless routers.  My first experiences with it were not too good, and so I kind of wrote them off and forgot about it.  However, my second look uncovered a real gold mine of technical wizardry and a lot of people who really seemed to understand what they were doing.  Now granted, they and I may disagree on some implementation issues, but the difference between OpenWrt and DD-WRT is that the OpenWrt folks basically give you serious tools for customizing things the way you want.  Unlike DD-WRT, the source code has the flavor of the FreeBSD ports system.  It is mostly a well-organized and very logical embedded systems development environment. After looking at OpenWrt for a while, I started to see that this was more than just a tool for improving consumer-grade wireless routers, but could definitely be the basis for other embedded design projects.

One good OpenWrt tool is their wiki.  I don't usually like wikis because they're organized horribly and using "search" comes with the same problems as any other search engine (you end up with hundreds of results and never the one you're looking for).  The OpenWrt wiki is mostly different in that it does have some organization to it, and the answers are mostly there.  Software/hardware developers don't like to write documentation, and it was clear from the OpenWrt docs that this is no exception.  However, it does look like the OpenWrt people are trying, in good faith, to get some stuff documented.  In particular, their supported devices list is extremely comprehensive with photos of the inside of hardware, pros and cons, and everything in between.  I really liked this, and it was where I started my search for a new wireless router.

Enter Netgear WNDR3700

After a lot of thought and looking through the OpenWrt hardware list, I decided to purchase the Netgear WNDR3700 wireless router.  Frankly, I couldn't explain the low-level hardware much better than the OpenWrt people, so you can see that at http://wiki.openwrt.org/toh/netgear/wndr3700.  A higher-level feature list is:
  • Dual-band wireless "N" (2.4 GHz and 5 GHz)  [note that I don't think I have any 5 GHz stuff, but it is nice to have available]
  • 4-port Gigabit Switch (like the WNR3500L)
  • Gigabit WAN port (also like the WNR3500L).  However, the WAN port goes directly to the CPU and not to the switching chip, so it should not be used as a pass-thru bridged port unless you're willing to have the CPU as a choke-point!
  • USB port (for external storage)
  • Chipset is Atheros (CPU/wireless) and Realtek (switch) based -- No more problems with it being Broadcom-proprietary.
  • Around 6W power usage (as measured by my Kill-a-Watt)!  WOW!
The case is kind of hokey in some ways, and the stock antennas are a joke.  However, there are a few modifications that people have documented to replace the antennas.  At the same time I make fun of the hokey antennas, I will also say that (at least for 2.4 GHz) the range is comparable to anything else I've used.  So I'm not sure it's worth complaining about.

Also interesting is that the "stock" Netgear firmware is actually a modified older version of OpenWrt.  The user interface is a typical Netgear-branded interface very typical of their other products, but underneath was OpenWrt.  Cool.  I never really did anything with the original firmware before going to a newer release of OpenWrt.

WNDR3700 Bad Things

Now for the bad and ugly...  The older WNDR3700 had issues with the 2.4 GHz wireless radio, which would simply stop working after a while.  The newer WNDR3700 (marked on the side of the box as WNDR3700v2) doesn't have that problem, but also isn't supported by OpenWrt except in the development release.  No big deal, since it seems to work OK.  Except...

There is something wrong with the open source access point software hostapd.  This daemon handles the access point operations, such as radio frequency/channel adjustment, linking/unlinking from clients, and most importantly, handling the WEP/WPA/WPA2 encryption/decryption.  It works mostly, except that my wireless camera no longer works with it.  In debugging mode, after the device is mostly linked-up, I get the message "WPA: received EAPOL-Key 2/2 Group with unexpected replay counter" and I cannot communicate with the camera (even though it appears to be connected).
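For readers wondering what that message refers to: in the WPA key handshakes the access point numbers each EAPOL-Key request with a replay counter, and a reply from the client is only accepted if it echoes a counter the access point is still waiting on.  The following is a minimal sketch of that idea only.  It is not hostapd's code, the class and method names are made up for illustration, and the real bookkeeping is more involved.

```python
# Simplified sketch of replay-counter matching in a WPA key handshake.
# NOT hostapd's implementation; Authenticator and its methods are
# hypothetical names used only for illustration.

class Authenticator:
    def __init__(self):
        self.replay_counter = 0   # incremented for every EAPOL-Key request sent
        self.pending = set()      # counters of requests still awaiting a reply

    def send_key_request(self):
        """Send an EAPOL-Key request and remember its replay counter."""
        self.replay_counter += 1
        self.pending.add(self.replay_counter)
        return self.replay_counter

    def receive_key_reply(self, counter):
        """Accept a reply only if it echoes a counter we are waiting on."""
        if counter not in self.pending:
            return "unexpected replay counter"
        # Assumption for this sketch: one valid reply retires all pending requests.
        self.pending.clear()
        return "ok"


ap = Authenticator()
sent = ap.send_key_request()            # e.g. group key message 1/2
print(ap.receive_key_reply(sent))       # ok
print(ap.receive_key_reply(sent))       # unexpected replay counter (already used)
print(ap.receive_key_reply(sent + 5))   # unexpected replay counter (never sent)
```

A client that answers with a stale or mismatched counter would end up in the second or third case, which is one plausible way to see the log line above while the device still looks associated.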

After hours of work, I think I know what is wrong with hostapd, but am not able to fix it.  Basically, my camera's configuration has a single check-box to enable WPA/WPA2 and a place to enter the pre-shared key (PSK).  When I reconfigured my router to accept WPA only or WPA+WPA2 (I think), the camera linked to the router just fine, but not when I enabled WPA2 only.  It looks like the hostapd software is either interpreting the WPA2 specification too literally and not allowing for some poetic license when it comes to the spec, or...and this is more likely...that the camera is trying WPA and WPA2 kind of at the same time, and confuses hostapd.  The problem is that it works with any other wireless router I could throw at it, so something is clearly buggy with this.  Some people are accusing the Atheros chipset, but I suspect that it is more likely a weird bug in hostapd.  It, too, has source code that is convoluted (to say the least) like just about every piece of software that implements cryptography.  I have been trying to understand what is wrong, and am still doing so.  However, this kind of complicates things a bit since I would really like to have my camera work yet!
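To make the three modes concrete, here is a rough sketch of how "WPA only", "WPA+WPA2" and "WPA2 only" map onto hostapd's own configuration options.  The option names (wpa, wpa_key_mgmt, wpa_passphrase, wpa_pairwise, rsn_pairwise) are hostapd's, and wpa is a bit field where 1 means WPA, 2 means WPA2 and 3 means both.  The helper function, the mode names and the cipher choices (TKIP for WPA, CCMP for WPA2) are assumptions for illustration, not what OpenWrt necessarily generates.

```python
# Sketch: map the three security modes to hostapd.conf lines.
# Option names are hostapd's; the function name, mode names and
# cipher choices are assumptions made for this example.

WPA_BITS = {"wpa": 1, "wpa2": 2, "mixed": 3}  # hostapd "wpa" is a bit field


def hostapd_psk_lines(mode, passphrase):
    """Return hostapd.conf lines for a pre-shared-key setup in the given mode."""
    bits = WPA_BITS[mode]
    lines = [
        f"wpa={bits}",
        "wpa_key_mgmt=WPA-PSK",
        f"wpa_passphrase={passphrase}",
    ]
    if bits & 1:
        lines.append("wpa_pairwise=TKIP")   # pairwise cipher offered for WPA
    if bits & 2:
        lines.append("rsn_pairwise=CCMP")   # pairwise cipher offered for WPA2
    return lines


for mode in ("wpa", "mixed", "wpa2"):
    print(mode, hostapd_psk_lines(mode, "example passphrase"))
```

Comparing the generated lines for the working modes against the failing "wpa2" case is a cheap way to see exactly which settings differ before digging into the hostapd source.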

Unfortunately, I didn't try the original Netgear software to see if that also had trouble, being based on OpenWrt.

All this being said, I still think that the router was a good purchase so far.  We'll see what happens when I start throwing more complicated stuff at it!

Why Not A Dedicated Router?

I was faced with a dilemma when I started this project:  Should I try to make better use of my wireless router hardware, or just buy a small, low-power computer like the Dreamplug from Globalscale Technologies?  This would have been easier in the short-term because I could have kept the wireless issues and routing issues separate.  However, in the longer term, I didn't like the idea of putting another underutilized power-using (albeit low-power) device on the network, and having to maintain yet another OS/platform.  In addition, the Dreamplug (with console/JTAG adapter) is around $200 with shipping, and I really couldn't justify spending another $200 for a system that would just sit on the network routing/firewalling packets.  On the other hand, my time has some value too.

In the end, I decided on replacing the wireless router because I figured that the long-term issues I had with a separate device outweighed its short-term convenience, and because of what I could learn by implementing a router on a less-expensive hardware platform.  Besides, I could use these low-cost, low-power routers as cheap embedded controllers for a number of applications (and at work, we have some situations like this).  I will feel much more satisfied with my decision after solving the wireless camera vs. hostapd problem.

Final Words For Broadcom/Netgear

What would be the motivation for Broadcom or Netgear to make hardware information open?  Well, there are a lot of computer enthusiasts out here who are not hardware designers that would love to buy these products, take them apart (so to speak), and make them do things beyond what the average consumer would want.  We want to implement every last function that the device is capable of.  In addition, we're writing the software for free, and making it available for everyone to use.  That's functionality that Broadcom and Netgear can leverage for future product designs or more enhanced firmware features.  Nobody really loses here.

2 comments:

Anonymous said...

Thank you very much for the long and detailed review of both routers.

Anonymous said...

Great post!