Monthly Archives: April 2014

Testing ownCloud performance

Update 2014-04-28: Upgrading to ownCloud 7.0.1 has not changed the performance of the platform at all.

Update 2014-04-28: I have found that downloading files is quite fast. At about 10% CPU load, the server can saturate my 10Mbit/s internet connection when I download my files to another computer over https. When uploading files, top shows mostly waiting time; when downloading, top shows mostly idle time. I suspect the SQL communication/overhead is limiting upload performance, and that ownCloud keeps a lot of bookkeeping in the database. If it does so for a good reason, and download is reasonably fast, I can live with it. I am keeping my original article (on upload performance) below anyway, but I find the performance quite acceptable for my real-world application, on my not so powerful hardware.

Update 2014-04-26: Tried FastCGI/mod-fcgid, see below.

Ubuntu announced that they will cancel the Ubuntu One service, and Condoleezza Rice will start working for Dropbox. So, how am I going to share my files among different computers and devices?

ownCloud looks like a nice option. It is like Dropbox, but I can run it myself, and it works not only for files, but also for contacts/calendars and smartphones.

Buying ownCloud as a service is possible, but as soon as I want to store my pictures (and perhaps some video and music) it gets pretty expensive. If I host it myself, several hundred GB of disk is no problem.

So, I installed ownCloud (6.0.2) on my QNAP TS 109 running Debian (7.4). Horrible performance – it took a minute to log in. Ok – the QNAP has a 500MHz ARM, but even worse, just 128MB of RAM and quite slow disk access. What device to put ownCloud on? A nice new QNAP (TS-221) is quite pricey, and a Raspberry Pi accesses both disk and network over its USB bus. I came to think of buying a used G4 Mac Mini – they are really cheap! Then I came to think of my old Titanium PowerBook G4 that has been gathering dust for the last year, and I decided to try running ownCloud on it. Perhaps not as a long term solution, but as a learning/testing machine it could work fine.

ownCloud Server configuration
CPU: G4 866MHz
RAM: 1024MB
HD: 320GB ATA
OS: Debian, 7.4 (PPC) fresh install on entire hard drive
DB: mysql 5.5 (the std version for Debian)
https: apache 2.2 (the std version for Debian)

To improve performance, I enabled APC for PHP, and disabled full text search.

Performance measurements
For the performance tests, I transferred 1x100MB, 10x10MB and 100x1MB files. I measured the times with a regular stopwatch, and occasionally I repeated a test when the result was strange. The measurements below are not exact, but the big picture is there.

Transfers are made from a Windows 7 PC over a Gbit network.
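
If you want to repeat the measurements, comparable test sets can be generated with dd. This is only a minimal sketch, assuming a Linux or BSD shell; the function name make_test_files and the file naming are made up for illustration, and random data is used so that compression along the way does not distort the result:

```shell
# Create COUNT files of SIZE_MB megabytes each in directory DIR.
# The files are filled with random data; the names are arbitrary.
make_test_files() {
    dir=$1; count=$2; size_mb=$3
    mkdir -p "$dir"
    i=1
    while [ "$i" -le "$count" ]; do
        dd if=/dev/urandom of="$dir/test_${count}x${size_mb}MB_$i.bin" \
           bs=1048576 count="$size_mb" 2>/dev/null
        i=$((i + 1))
    done
}
```

Calling it three times, as make_test_files DIR 1 100, make_test_files DIR 10 10 and make_test_files DIR 100 1, gives the three sets used here.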

                                       1x100MB   10x10MB   100x1MB
Encryption and checksum on G4 / server
(1): ssl encrypt aes ; sync                 7s
(2): md5sum                                 1s
File transfer using other methods
(3): ftp/Filezilla                          3s        3s        4s
(4): sftp/Filezilla                        14s       15s       17s
ownCloud
(5): No SSL, no APC                        15s       32s      234s
(6): No SSL, APC                           16s       27s      197s
(7): SSL, APC                              34s       43s      263s
(8): SSL, APC, encryption                  46s       69s      438s

Comments on Performance
(1): tells me that the server is capable of encrypting 100MB of data, and syncing the output to local disk, in 7 seconds. The sync takes less than a second.
(2): tells me that the server is capable of processing 100MB of data in a second.
(3): tells me that simply sending the files over the network with a proven protocol takes 3-4 seconds, slightly longer for many small files.
(4): tells me that sending the files in an encrypted stream with a proven protocol over the network takes about 15 seconds, slightly longer for many small files.
(5): shows that the overhead for many files in ownCloud is massive.
(6): shows that APC helps, but not in a very significant way.
(7): shows the extra cost of SSL (transferring over a secure channel).
(8): shows the extra cost of encrypting the files for the user, on the server (using openssl/AES, according to the ownCloud documentation).

It makes sense to compare rows (3) and (6), indicating that with no encryption whatsoever the overhead of ownCloud is 5-50x the actual work. Or, put differently, the resources used for actually transferring and storing files are 20%-2%; the rest of the resources, 80%-98%, are “wasted”. Now ownCloud has some synchronization and error handling capabilities not found in FTP, but I don’t think that justifies this massive overhead.

In the same way it makes sense to compare row (4) and (7), indicating a waste of 60%-94% of resources, for using a secure channel (and I believe that SSH uses stronger encryption than TLS).
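
The percentages above follow directly from the table. A small shell sketch of the arithmetic (the function name waste_percent is just for illustration; awk does the floating point math):

```shell
# Share of the ownCloud transfer time that is overhead compared to a
# baseline protocol, in percent (rounded).
# Usage: waste_percent BASELINE_SECONDS OWNCLOUD_SECONDS
waste_percent() {
    awk -v base="$1" -v oc="$2" \
        'BEGIN { printf "%.0f\n", (1 - base / oc) * 100 }'
}

# Rows (3) and (6): ftp vs ownCloud without SSL
waste_percent 3 16     # 1x100MB  -> 81
waste_percent 4 197    # 100x1MB  -> 98

# Rows (4) and (7): sftp vs ownCloud with SSL
waste_percent 14 34    # 1x100MB  -> 59
waste_percent 17 263   # 100x1MB  -> 94
```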

For average file size smaller than 1MB, the waste will be even bigger.

I suspect the cost is related to executing PHP for each and every file. It could also be the use of the database for each file that is expensive. Somewhere I read that there are “hundreds” of database calls for each web server request handled by ownCloud.

Suggestions
It is of course a bit arrogant to suggest solutions to a problem in an Open Source project that I have never contributed to, without even reading the code. Anyway, here we go:

  • Find a way to upload small directories (<10MB, or <60s transfer) as tarballs or zipfiles. This should of course happen transparently to the user (and only work via the client, not the web). This way hundreds or thousands of small files could be uploaded in a few seconds instead of a very long time - and the load on the server would decrease a lot.
  • Similar to the first suggestion, allow files to be uploaded in fragments, to allow upload of 2GB+ files on all server platforms (it is ridiculous that an ARM server, like a QNAP, can not handle 2GB+ files, which according to the documentation is the case).
  • Alternatively, allow ownCloud to use ssh/sftp as transfer protocol. It will not work in all situations, but when client and server are allowed to communicate on port 22, and ownCloud is installed on a server with ssh enabled, it could be an option.

I kind of presume that the problem is one-file-per-request and WebDAV limitations. Perhaps it is the database that is the problem? Nevertheless, I think some kind of batch handling of uploads/downloads is the solution in that case too.
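
To make the batching idea concrete, here is a rough sketch of what the client side could do before an upload. This is purely hypothetical and not something ownCloud or its client offers; the function name bundle_if_small and the 10MB default limit are my own assumptions:

```shell
# Pack a directory into one gzip-compressed tarball if it is small enough,
# so that it could be sent in a single request instead of one per file.
# Usage: bundle_if_small DIR ARCHIVE [LIMIT_KB]   (default limit: 10240 KB)
bundle_if_small() {
    dir=$1; archive=$2; limit_kb=${3:-10240}
    size_kb=$(du -sk "$dir" | cut -f1)      # disk usage of DIR in KB
    if [ "$size_kb" -le "$limit_kb" ]; then
        tar -czf "$archive" -C "$dir" .     # paths stored relative to DIR
    else
        echo "too large to bundle: ${size_kb} KB" >&2
        return 1
    fi
}
```

The server would then unpack the archive and register all files in one go, which is where the saving would have to come from.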

LAMP
ownCloud is built on LAMP, and I doubt the performance problems are related to the LA in LAMP. Also, I don’t think that the M should be the problem if the database calls are kept at a reasonable level. The problem must be with P(HP). I understand and appreciate that PHP is simple and productive, and probably 95% of ownCloud can be perfectly well written in PHP. But perhaps there are a few things that should be written in something more high-performing (I am talking about C, of course)?

Conclusion
I really like the ambition of ownCloud, and mostly, the software is very nice. The server has many features, and the clients are not only nice, but also available for several platforms.

ownCloud is a quite mature product, at version 6. I wish some effort were put into improving performance. I believe there are possible strategies that would not require very much rewriting, and would not need to break compatibility. And I also believe it makes much sense to optimize the ownCloud server code: not only because people may run it on Raspberry Pis, QNAPs or old hardware, but also because it would improve the usefulness on more powerful servers.

2014-04-26: FastCGI / mod-fcgid
I decided to try PHP via FastCGI to see if it could improve performance. Very little difference – I disabled it and got back to “recommended” configuration. For details, read on.

I mostly followed this guide (as apache2-mod-fastcgi seems to be replaced by apache2-mod-fcgid lately, other high-ranking guides were out of date). The following options need to be added to /etc/apache2/apache2.conf:

FcgidFixPathinfo 1               (not in the site definition as suggested in guide)
FcgidMaxRequestLen 536870912     (effectively limits maximum file size)

Broken USB Drive

A friend had problems with a 250GB external Western Digital Passport USB drive. I connected it to Linux, and got:

[ 1038.640149] usb 3-5: new full-speed USB device number 4 using ohci-pci
[ 1038.823970] usb 3-5: device descriptor read/64, error -62
[ 1039.111652] usb 3-5: device descriptor read/64, error -62
[ 1039.391408] usb 3-5: new full-speed USB device number 5 using ohci-pci
[ 1039.575187] usb 3-5: device descriptor read/64, error -62
[ 1039.862954] usb 3-5: device descriptor read/64, error -62
[ 1040.142662] usb 3-5: new full-speed USB device number 6 using ohci-pci
[ 1040.550269] usb 3-5: device not accepting address 6, error -62
[ 1040.726092] usb 3-5: new full-speed USB device number 7 using ohci-pci
[ 1041.133774] usb 3-5: device not accepting address 7, error -62
[ 1041.133806] hub 3-0:1.0: unable to enumerate USB device on port 5

It turned out the USB/SATA-controller was broken, but the drive itself was healthy. I took the 2.5" SATA-drive out of the enclosure and connected it to another SATA-controller – all seems fine.

Compile program for OpenWRT

I felt a strong desire to compile something, anything, for my WRT54GL running OpenWrt. As is often the case, in the end it is very simple, but finding the simple solution is not very easy. Ironically, the best instructions were not on the OpenWRT site (but I downloaded the Toolchain from openwrt.org).

The program I wanted to compile was a pure C program that I have written myself. Almost clean C89/ANSI code, with a few C99/Posix dependencies. No autoconfigure, no makefile.

Solution/Conclusion
I first downloaded the Toolchain for OpenWRT 12.09, brcm47xx from OpenWRT (even though I run OpenWRT 10.03.1 brcm-2.4). I unpacked it to ~/openwrt/.

Second, two environment variables:

$ PATH=$PWD/OpenWrt-Toolchain-brcm47xx-for-mipsel-gcc-4.6-linaro_uClibc-0.9.33.2/toolchain-mipsel_gcc-4.6-linaro_uClibc-0.9.33.2/bin:$PATH

$ STAGING_DIR=$PWD/OpenWrt-Toolchain-brcm47xx-for-mipsel-gcc-4.6-linaro_uClibc-0.9.33.2/toolchain-mipsel_gcc-4.6-linaro_uClibc-0.9.33.2
$ export STAGING_DIR

Third, compile

$ mipsel-openwrt-linux-uclibc-gcc program.c

Finally, send it to openwrt and test

$ scp a.out root@192.168.0.1:.
$ ssh -l root 192.168.0.1

# ./a.out

That was all I had to do, really. Now some comments on this.

Choosing a compiler/toolchain
OpenWRT gives you three download options that all seem relevant if you want to compile stuff.

The Toolchain is a lot smaller than the others. At least for the brcm platform the SDK is named “for-linux-i486” while the Toolchain is named “for-mipsel”. That confused me because I was not sure the Toolchain was actually a cross compiler.

To confuse things more, as soon as you start reading about how to compile stuff for OpenWRT, everyone talks about the “Buildroot”. No cheating by downloading the SDK or Toolchain and getting an already compiled compiler! Real men compile their compilers themselves? No disrespect here… the buildroot is fantastic technology, and perhaps I will have my own one day, but right now a C compiler is all I want.

Also, it seems the Toolchain that comes with 10.03.1/brcm-2.4 (my platform) is broken (ok, I am not smart enough to make it work; cc1 complains about an unknown parameter). However, in my case the toolchain for 12.09/brcm47xx also worked. I wasted much time with the 10.03.1/brcm-2.4 Toolchain. If my simple steps above don’t work quite quickly for you, and you get weird errors from the compiler, download another Toolchain (perhaps even for another platform, and perhaps 12.09/brcm47xx that works for me) just to see if you can get that to work (you may not be able to use the compiled binary, but you can at least confirm that you can generate one).

I suppose it is preferred to use the Toolchain for the OpenWRT platform/version you are actually targeting, but newer toolchains for compatible platforms can also work. Perhaps the newest toolchain is always preferable.

Linking and optimizing
Compiler flags affect the size of your binary, and for OpenWRT you typically want a small binary. I guess the “-Os -s” options to the compiler are the best you can do. The binary itself is static. No dynamic linking. I think that means it only communicates with the rest of the world via Linux system calls, and as long as those have not changed in a non-compatible way, you can compile your program with a different toolchain than the one used to build the OpenWRT image and packages (of course, the binary format must be right too).
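
As a sketch, the size-oriented build can be wrapped in a small shell function. In GCC, -Os optimizes for size and -s strips the symbol table from the output. The function name build_small is made up for illustration, and CC defaults to the cross compiler from the steps above (it must be in PATH, with STAGING_DIR exported):

```shell
# Cross compiler to use; override CC to build with another toolchain.
CC=${CC:-mipsel-openwrt-linux-uclibc-gcc}

# Build SOURCE into OUTPUT, optimized for size and stripped of symbols.
# Usage: build_small SOURCE OUTPUT
build_small() {
    "$CC" -Os -s -o "$2" "$1"
}
```

For example, build_small program.c program, followed by ls -l program to compare the size with a plain build.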

The C library
What about standard library compliance? My Toolchain came with “uClibc” (although others should be possible). My program uses two things that are not C89/ANSI-C compliant (I know because Visual Studio complained):

1) snprintf(): Worked fine, I believe this is C99 standard.

2) clock_gettime(): Compiled without errors or warnings, but did absolutely nothing. The timespec struct passed to the function was not modified at all by the call. This should be a POSIX function (not Linux specific). I guess it is either not implemented in uClibc, or I should use another clockid_t (I used CLOCK_MONOTONIC), or there is a system call behind it that does not work properly when the toolchain is different from the one that built the kernel.

Update 20140814: The CLOCK_MONOTONIC problem is fixed with BB 14.07.

So, generally the compiler and uClibc worked very nicely, but some testing is required.

Build machine
I run the toolchain/cross compiler on an x64 machine running Ubuntu. The toolchain itself seems to be statically linked (ldd tells me), and built for x86 (readelf tells me). So most x86/x64 Linux machines should work just fine, and if you are on BSD you probably know how to run Linux binaries.

Toolchain limitations
For my purposes, the Toolchain was just what I needed. I do not know how to build ipk-package files, and I do not know how to build a complete OpenWRT image. Perhaps the Toolchain is not the right tool for those purposes.

QEMU
If you install QEMU you can test your OpenWRT binary on your x86/x64 Linux machine:

qemu-mipsel -L OpenWrt-Toolchain-brcm47xx-for-mipsel-gcc-4.6-linaro_uClibc-0.9.33.2/toolchain-mipsel_gcc-4.6-linaro_uClibc-0.9.33.2/ a.out

In the same way, I guess it should be quite possible to run the Toolchain on a non-x86 machine as well. I will write a few lines when I have compiled my OpenWRT/MIPS binaries on my QNAP/ARM, running the x86 compiler/toolchain with QEMU.

IPv6 access with 6to4 OpenWRT Backfire

A little while ago I shared some information on getting IPv6 at home, when all you have is a dynamic (but real/public) IP-address and a good old WRT54GL router with OpenWRT Backfire (brcm-2.4 edition).

I have now stabilized my configuration and I will share some details. You are presumed to

  • be comfortable with editing configuration files manually (using vi, or some other editor in OpenWRT)
  • use OpenWRT Backfire 10.03.1 on your router (which can probably be any router capable of running OpenWRT)
  • have some understanding of what you are about to do and why
  • have a public (but not necessarily static) IPv4 address

If you mess up your firewall rules, worst case you can not log in to your router or you expose your entire network to the world. Proceed at your own risk.

At some point you will start trying your IPv6 connectivity. I suggest using test-ipv6.com, ipv6-test.com and ipv6.google.com.

A good start is the OpenWRT IPv6 Article (it contains much information, but it is not very well structured). First follow the 6to4, 6rd instructions (down to the firewall rule, which is probably fine, but I don’t need it).

You also need to enable IPv6 forwarding (which is described in the 6in4 section).
edit /etc/sysctl.conf:

net.ipv6.conf.all.forwarding=1

Then

/etc/init.d/sysctl restart

Now you should start testing what works and what does not. Run ifconfig both on the router and on your local machine (ipconfig on Wintendo). If you have a reasonably new OS, you should now at least have an IPv6 address, even if you can’t ping6 or connect to anything.

Note: Your 6to4 address should start with 2002: (both on the router and on clients). Addresses starting with fe80: are link-local addresses; they are only valid on the local network segment and will not get you out on the IPv6 internet.
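
To know what prefix to expect: a 6to4 prefix is 2002: followed by the public IPv4 address written as two hexadecimal groups, giving a /48. A small sketch that derives it (the function name ipv4_to_6to4_prefix is just for illustration, and 192.0.2.1 is a documentation address, not mine):

```shell
# Print the 6to4 /48 prefix that belongs to a public IPv4 address.
# Usage: ipv4_to_6to4_prefix A.B.C.D
ipv4_to_6to4_prefix() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    printf '2002:%02x%02x:%02x%02x::/48\n' "$1" "$2" "$3" "$4"
}

ipv4_to_6to4_prefix 192.0.2.1    # prints 2002:c000:0201::/48
```

Since the prefix is built from the IPv4 address, it changes whenever the dynamic IPv4 address changes.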

Firewall
You probably have a masquerading firewall configured for IPv4, but if you bother with IPv6 at all you probably don’t want to masquerade for IPv6 (I don’t know if it is even possible).

I wanted my IPv4 to work just normally. And I wanted all my LAN-computers to be real IPv6 members accessible from the IPv6 internet (and protected by firewall, as needed, of course). That means, all replies from Internet should be fine, but incoming traffic from Internet should be restricted. The most natural thing would be to use connection tracking, but I encountered problems.

This is what my firewall configuration looks like now:
/etc/config/firewall

config 'defaults'
	option 'input' 'DROP'
	option 'output' 'ACCEPT'
	option 'forward' 'DROP'
	option 'syn_flood' '1'
	option 'drop_invalid' '1'
	option 'disable_ipv6' '0'

config 'zone'
	option 'name' 'lan'
	option 'network' 'lan'
	option 'input' 'ACCEPT'
	option 'output' 'ACCEPT'
	option 'forward' 'REJECT'
	option 'mtu_fix' '1'

config 'zone'
	option 'name' 'wan'
	option 'network' 'wan'
	option 'family' 'ipv4'
	option 'masq' '1'
	option 'output' 'ACCEPT'
	option 'forward' 'DROP'
	option 'input' 'DROP'

config 'zone'
	option 'name' 'wan6'
	option 'network' '6rd'
	option 'family' 'ipv6'
#	option 'conntrack' '1' 
	option 'output' 'ACCEPT'
	option 'forward' 'DROP'
	option 'input' 'DROP'

config 'forwarding'
	option 'src' 'lan'
	option 'dest' 'wan'
	option 'family' 'ipv4'

config 'forwarding'
	option 'src' 'lan'
	option 'dest' 'wan6'
	option 'family' 'ipv6'

config 'include'
	option 'path' '/etc/firewall.user'

config 'rule'
	option 'target' 'ACCEPT'
	option '_name' 'IPv6 WRT54GL ICMP'
	option 'src' 'wan6'
	option 'proto' 'icmp'
	option 'family' 'ipv6'

config 'rule'
	option '_name' 'IPv6: Forward ICMP'
	option 'target' 'ACCEPT'
	option 'family' 'ipv6'
	option 'src' 'wan6'
	option 'dest' 'lan'
	option 'proto' 'icmp'

config 'rule'
	option '_name' 'IPv6: WRT54GL "reply" to 1024+'
	option 'target' 'ACCEPT'
	option 'family' 'ipv6'
	option 'src' 'wan6'
	option 'dest_port' '1024-65535'
	option 'proto' 'tcp'

config 'rule'
	option '_name' 'IPv6: Forward "reply" to 1024+'
	option 'target' 'ACCEPT'
	option 'family' 'ipv6'
	option 'src' 'wan6'
	option 'dest' 'lan'
	option 'dest_port' '1024-65535'
	option 'proto' 'tcp'

Some comments on this:

  • I think it makes sense to think about IPv6 Internet as a separate wan6, not as part of wan
  • Incoming traffic is forwarded, as long as it is to unprivileged ports (1024+)
  • ICMP works between everyone
  • The firewall.user script contains nothing of interest for IPv6
  • Masquerade is activated for wan, but conntrack (or masquerade) does not work for wan6
  • I have not needed a rule to allow INPUT protocol 41 to the router itself (the 6to4 traffic over IPv4), perhaps it gets allowed as ESTABLISHED,RELATED

Bridging and Connection tracking problems
I believe my configuration is working properly. But something is not completely right. Loading the firewall…

root@OpenWrt:~# /etc/init.d/firewall restart
Loading defaults
ip6tables: No chain/target/match by that name.
ip6tables: No chain/target/match by that name.
ip6tables: No chain/target/match by that name.
ip6tables: No chain/target/match by that name.
ip6tables: No chain/target/match by that name.
ip6tables: No chain/target/match by that name.
Loading synflood protection
Adding custom chains
Loading zones
Loading forwardings
Loading redirects
Loading rules
Loading includes
Loading interfaces
ip6tables: No chain/target/match by that name.

At the end of the OpenWRT IPv6 documentation:
Note: firewall v1 (e.g. still in Backfire 10.03.1-rc4 and up to r25353) has no default rules at all and ip6tables configuration needs to be done from scratch. Insert the rules below to make the packet filter function properly.

ip6tables -A FORWARD -i br-lan -j ACCEPT
ip6tables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
ip6tables -A FORWARD -j REJECT

Well, I should be on a more recent version (10.03.1) but the second line (with conntrack) gives the No chain/target/match by that name error. I don’t know why, and I don’t know how to fix it.

Also, in the same document, under the heading Directly forward ISP’s NDP proxy address to LAN there are instructions for “firewalling on ipv6 even for bridged interfaces”. I believe that this is what I want to do, but the ebtables package/module seems to not be available for WRT54GL/Backfire 10.03.1/brcm-2.4, and it also seems to be known to cause performance problems.

Either:

  1. I messed something up when installing/configuring OpenWRT, and now I don’t know how to fix it
  2. Something IPv6-related that I want to do is not fully supported on Backfire/brcm-2.4
  3. I am just trying to do the wrong thing, without understanding it

Other config files
In case it is helpful to anyone (and possibly myself in the future) I post a few of my configuration files.

/etc/sysctl.conf (there are more lines)

net.ipv6.conf.default.forwarding=1
net.ipv6.conf.all.forwarding=1

/etc/config/network (entire file)

config 'switch' 'eth0'
	option 'enable' '1'

config 'switch_vlan' 'eth0_0'
	option 'device' 'eth0'
	option 'vlan' '0'
	option 'ports' '0 1 2 3 5'

config 'switch_vlan' 'eth0_1'
	option 'device' 'eth0'
	option 'vlan' '1'
	option 'ports' '4 5'

config 'interface' 'loopback'
	option 'ifname' 'lo'
	option 'proto' 'static'
	option 'ipaddr' '127.0.0.1'
	option 'netmask' '255.0.0.0'

config 'interface' 'lan'
	option 'type' 'bridge'
	option 'ifname' 'eth0.0'
	option 'proto' 'static'
	option 'netmask' '255.255.255.0'
	option 'ipaddr' '192.168.8.1'

config 'interface' 'wan'
	option 'ifname' 'eth0.1'
	option 'proto' 'dhcp'

config 'interface' '6rd'
	option 'proto' '6to4'
	option 'adv_subnet' '1'
	option 'adv_interface' 'lan'

/etc/config/radvd (all other configs have option ignore 1)

config interface
	option interface	'lan'
	option AdvSendAdvert	1
	option AdvManagedFlag	0
	option AdvOtherConfigFlag 0
	list client		''
	option ignore		0

And a few packages that you should probably have installed in OpenWRT:

6to4
firewall
ip
ip6tables
kmod-ip6tables
kmod-ipv6
radvd
libip6tc

DHCP & DNS
I have not enabled any (IPv6) DHCP – autoconfiguration works fine for me. I have also not configured anything DNS related. My normal DNS resolves IPv6-only hosts ok (e.g. ipv6.google.com).

The day I want to allow incoming traffic to just a few of my local/LAN machines I will have to think about it.

Troubleshooting
The following tools/strategies have proven useful for troubleshooting:

  • ping6 between router and local/LAN machines
  • ping6 to internet hosts (ipv6.google.com)
  • Disable firewall or set policies to ACCEPT
  • Send/receive TCP traffic using ncat (the best nc/netcat version for OpenWRT).
  • Test ping/ncat to/from an IPv6 host on a different network – I installed miredo on my Lubuntu netbook and let it connect to internet via my iPhone. That way it had no shortcut at all to my router and LAN.
  • I find myself having more success when I unplug my router to restart it; just restarting makes it not come up properly.

ncat
In case you are not familiar with ncat:

On the router (start listening):

root@OpenWrt:~# ncat -6 -l -p 9999

On your local computer (send a message):

$ echo 6-TEST | nc 2002:????:????:1::1 9999

On the router (should have got message):

root@OpenWrt:~# ncat -6 -l -p 9999
6-TEST

This is useful in all directions, and on different ports, to confirm that your firewall works as you expect.

USB Drives, dd, performance and No space left

Please note: sudo dd is a very dangerous combination. A little typing error and all your data can be lost!

I like to make copies and backups of disk partitions using dd. USB drives sometimes do not behave very nicely.

In this case I had created a less than 2GB FAT32 partition on a USB memory and made it Lubuntu-bootable, with a 1GB file for saving changes to the live filesystem. The partition table:

It seems I forgot to change the partition type to FAT32 (it is listed as Id 83, Linux), but it is formatted as FAT32 and that seems to work fine 😉

$ sudo /sbin/fdisk -l /dev/sdc

Disk /dev/sdc: 4004 MB, 4004511744 bytes
50 heads, 2 sectors/track, 78213 cylinders, total 7821312 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000f3a78

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *        2048     3700000     1848976+  83  Linux

I wanted to make an image of this USB drive that I can write to other USB drives. That is why I made the partition/filesystem significantly below 2GB, so all 2GB USB drives should work. This is how I created the image:

$ sudo dd if=/dev/sdb of=lubuntu.img bs=512 count=3700001

So, now I had a 1.85GB file named lubuntu.img, ready to write back to another USB drive. That was when the problems began:

$ sudo dd if=lubuntu.img of=/dev/sdb
dd: writing to ‘/dev/sdb’: No space left on device
2006177+0 records in
2006176+0 records out
1027162112 bytes (1.0 GB) copied, 24.1811 s, 42.5 MB/s

Very fishy! The write speed (42.5MB/s) is obviously too high, and the USB drive is 4GB, not 1GB. I tried with several (identical) USB drives, same problem. This has never happened to me before.
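
A quick sanity check of the numbers, using only the fdisk and dd output above. The partition ends at sector 3700000 and sectors are counted from 0, so a complete image is 3700001 sectors of 512 bytes, while dd gave up after 2006176 records. The function names are made up for illustration:

```shell
# Number of sectors needed to cover everything up to and including END.
sectors_needed() { echo $(( $1 + 1 )); }

# Bytes covered by RECORDS blocks of BLOCK_SIZE bytes (default 512).
records_to_bytes() { echo $(( $1 * ${2:-512} )); }

# How far a copy got, in whole percent of the total.
percent_written() { echo $(( $1 * 100 / $2 )); }

sectors_needed 3700000              # 3700001
records_to_bytes 2006176            # 1027162112, what dd reported
percent_written 2006176 3700001     # 54
```

So the write stopped a little more than halfway through the image, far below the 4GB the drive should hold.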

I changed strategy and made an image of just the partition table, and another image of the partition:

$ sudo dd if=/dev/sdb of=lubuntu.sdb bs=512 count=1
$ sudo dd if=/dev/sdb1 of=lubuntu.sdb1

…and restoring to another drive… first the partition table:

$ sudo dd if=lubuntu.sdb of=/dev/sdb

Then remove and re-insert USB Drive, make sure it does not mount automatically before you proceed with the partition.

$ sudo dd if=lubuntu.sdb1 of=/dev/sdb1

That worked! However, the write speed to USB drives usually slows down as more data is written (in one chunk, somehow). I have noticed this before with other computers and other USB drives. I guess USB drives have some internal mapping table that does not like big files.

Finally, to measure progress of the dd command, send it the signal:

$ sudo kill -USR1 <PID OF dd PROCESS>

Above behaviour noticed on x86 Ubuntu 13.10.