In case you wonder...

why it is so quiet here? The noise moved over to: http://paul.vc/+

Finished Dual Tube VCA Project

Yay! It's done. Lovely warm distortion and fits great into Eurorack / Doepfer setup. Check sound samples below.

PLS Synth Module Demos by pulsar

Kudos go to: Ken Stone and Bill and Will over at Dragonflyalley

Dual Tube VCA - Work in Progress II

Just finished mounting the first VCA Module. YAY!

and some waveforms. loving the results so far!

Dual Tube VCA - Work in Progress

A little project I am currently working on:

If WTF crosses your mind: it's gonna be a dual tube-driven voltage-controlled amplifier module for my Eurorack / Doepfer setup. Use Google or YouTube to get a bit less of that WTF (or more, depending on your tech affinity). Credits: Ken Stone - http://www.cgs.synth.net/modules/cgs65_vca.html and Bill & Will - http://www.dragonflyalley.com/constructionCGS65tubeVCA.htm

Bind hadoop to a specific network device

quick and dirty

hadoop-env.sh:

    # replace eth1:0 with your NIC / alias
    bind_ip=$(/sbin/ifconfig eth1:0 | grep 'inet addr:' | cut -d: -f2 | awk '{print $1}')

    export BIND_OPTS="-Dlocal.bind.address=${bind_ip}"

    # Command specific options appended to HADOOP_OPTS when specified
    export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS $BIND_OPTS"
    export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS $BIND_OPTS"
    export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS $BIND_OPTS"
    export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS $BIND_OPTS"
    export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS $BIND_OPTS"

append to hdfs-site.xml:

    <property>
      <name>dfs.secondary.http.address</name>
      <value>${local.bind.address}:50090</value>
      <description>
        The secondary namenode http server address and port.
        If the port is 0 then the server will start on a free port.
      </description>
    </property>

    <property>
      <name>dfs.datanode.address</name>
      <value>${local.bind.address}:50010</value>
    </property>

    <property>
      <name>dfs.datanode.http.address</name>
      <value>${local.bind.address}:50075</value>
    </property>

    <property>
      <name>dfs.datanode.ipc.address</name>
      <value>${local.bind.address}:50020</value>
    </property>

    <property>
      <name>dfs.http.address</name>
      <value>${local.bind.address}:50070</value>
    </property>

    <property>
      <name>dfs.datanode.https.address</name>
      <value>${local.bind.address}:50475</value>
    </property>

    <property>
      <name>dfs.https.address</name>
      <value>${local.bind.address}:50470</value>
    </property>

You will also need to hardcode the HDFS URL in core-site.xml by hand. This is not an issue, since the same config should be deployed on all nodes in the cluster anyway. The same pattern can be applied to mapred-site.xml; I did not need it.

Load-Sensitive ThreadPool / Queue in Java

Here is a little code snippet (actually a ready-to-run JUnit TestCase) which might come in handy if you need a fairly open ThreadPool that is not primarily limited by the number of active threads but rather by a predicted load factor. The latter can be pretty much anything, such as CPU load or the total number of "items" the whole ThreadPool is allowed to process at a given time.

If the predicted load is not dynamic enough for you, you might want to add another monitoring thread looking at some indicators (CPU, RAM, I/O) and adjust the LoadTracker's currentLoad value accordingly. Another path would be to skip the monitoring thread and extend the canHandle(load) method of the LoadTracker to respect the current indicator states.
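The second option, folding an indicator into canHandle(load), could look roughly like the sketch below. It is a standalone variant of the LoadTracker from the test case further down, not part of the original code: the class name IndicatorAwareLoadTracker, the maxIndicator threshold and the DoubleSupplier parameter are assumptions made for illustration. The demo feeds it the JVM's system load average, which is only one possible indicator and reports a negative value on platforms where it is unavailable.

```java
import java.lang.management.ManagementFactory;
import java.util.function.DoubleSupplier;

/** Hypothetical LoadTracker variant: predicted load plus one live indicator. */
class IndicatorAwareLoadTracker {
    private final int maxLoad;
    private final double maxIndicator;      // assumed threshold for the indicator
    private final DoubleSupplier indicator; // e.g. CPU load, RAM usage, I/O wait
    private int currentLoad = 0;

    IndicatorAwareLoadTracker(int maxLoad, double maxIndicator, DoubleSupplier indicator) {
        this.maxLoad = maxLoad;
        this.maxIndicator = maxIndicator;
        this.indicator = indicator;
    }

    synchronized void add(int load)    { currentLoad += load; }
    synchronized void remove(int load) { currentLoad -= load; }
    synchronized int get()             { return currentLoad; }

    /** True only if both the predicted load and the live indicator leave room. */
    synchronized boolean canHandle(int additionalLoad) {
        double reading = indicator.getAsDouble();
        // A negative reading is treated as "indicator unknown" and ignored.
        boolean indicatorOk = reading < 0 || reading < maxIndicator;
        return indicatorOk && (currentLoad + additionalLoad) < maxLoad;
    }

    public static void main(String[] args) {
        // System load average of the last minute; negative if not available.
        DoubleSupplier systemLoad =
                () -> ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        int cores = Runtime.getRuntime().availableProcessors();

        IndicatorAwareLoadTracker tracker =
                new IndicatorAwareLoadTracker(500, cores, systemLoad);
        tracker.add(10);
        System.out.println("load=" + tracker.get()
                + ", room for 10 more: " + tracker.canHandle(10));
    }
}
```

Passing the indicator in as a supplier keeps the tracker easy to test with a fake value, and swapping CPU load for something else does not touch the tracker itself.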

Oh, and please let me know if I am reinventing the wheel, sometimes it is difficult not to.

In retrospect, the same pattern could be applied to the Queue beneath the ThreadPool by coupling a LoadTrackableJob with a specific BlockingQueue. I guess you can always make the code / architecture prettier.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import junit.framework.TestCase;

    // Log / LogFactory are assumed to come from Apache Commons Logging
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    public class TestThreadPool extends TestCase
    {
        private static Log log = LogFactory.getLog(TestThreadPool.class);

        public void testLoadTracker() throws InterruptedException
        {
            int maxRunningThreads = 128;
            int maxLoad = 500;

            LoadTracker load = new LoadTracker(maxLoad);
            ExecutorService pool = Executors.newFixedThreadPool(maxRunningThreads);

            for (int i = 0; i < 500; i++)
            {
                // here you would *predict* the impact of your real job on the load factor.
                // we choose the load to be random.
                int predictedJobLoad = (int) Math.round(Math.random() * 10L);

                // block until the tracker has room; finished jobs notify this monitor
                while (!load.canHandle(predictedJobLoad))
                {
                    log.debug(String.format("WAIT: current load %d and new job is about to be %d", load.get(), predictedJobLoad));
                    synchronized (this) { this.wait(1000); }
                }

                log.debug(String.format("QUEUE: current load is %d and new job is about to be %d", load.get(), predictedJobLoad));

                // the job is created only after the check: its constructor reserves the load,
                // so creating it earlier would count the job twice in canHandle()
                MyJob aJob = new MyJob(load, predictedJobLoad, "job-" + i, this);
                pool.execute(aJob);
            }
            pool.shutdown();
            pool.awaitTermination(42, TimeUnit.DAYS);
            assertEquals(0, load.get());
        }

        private class MyJob implements Runnable
        {
            private LoadTracker loadTracker;
            private int load;
            private String jobId;
            private Object monitor;

            public MyJob(LoadTracker loadTracker, int load, String jobId, Object monitor)
            {
                this.jobId = jobId;
                this.loadTracker = loadTracker;
                this.load = load;
                this.monitor = monitor;
                // reserve the load right away, on the submitting thread
                loadTracker.add(load);
            }

            @Override
            public void run()
            {
                log.debug(String.format("RUN: %s with a load of %d", jobId, load));
                try
                {
                    // simulate some work
                    Thread.sleep((int) Math.round(Math.random() * 5000L));
                }
                catch (InterruptedException e)
                {
                    e.printStackTrace();
                }
                log.debug(String.format("FIN: %s with a load of %d", jobId, load));
                // give the load back and wake up the submitting thread
                loadTracker.remove(load);
                if (monitor != null) synchronized (monitor) { monitor.notify(); }
            }
        }

        private class LoadTracker
        {
            private int currentLoad = 0;
            private int maxLoad = 0;

            public LoadTracker(int maxLoad)
            {
                this.maxLoad = maxLoad;
            }

            private synchronized void add(int load)
            {
                this.currentLoad += load;
            }

            private synchronized void remove(int load)
            {
                this.currentLoad -= load;
            }

            public synchronized int get()
            {
                return currentLoad;
            }

            public synchronized boolean canHandle(int additionalLoad)
            {
                return ((this.get() + additionalLoad) < maxLoad);
            }
        }
    }
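On the "reinventing the wheel" question: a similar effect can probably be had with a counting java.util.concurrent.Semaphore, where each job acquires as many permits as its predicted load. The sketch below is an alternative technique, not the code above; the class name SemaphoreLoadLimiter and its methods are made up for illustration. One difference to note: the semaphore lets the total load reach maxLoad exactly, whereas canHandle() above uses a strict "less than".

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical load limiter: one semaphore permit per unit of predicted load. */
class SemaphoreLoadLimiter {
    private final int maxLoad;
    private final Semaphore permits;
    private final ExecutorService pool;

    SemaphoreLoadLimiter(int maxLoad, int maxThreads) {
        this.maxLoad = maxLoad;
        this.permits = new Semaphore(maxLoad);
        this.pool = Executors.newFixedThreadPool(maxThreads);
    }

    /** Blocks until 'load' permits are free, then hands the job to the pool. */
    void submit(int load, Runnable job) throws InterruptedException {
        if (load < 0 || load > maxLoad) {
            throw new IllegalArgumentException("load must be between 0 and " + maxLoad);
        }
        permits.acquire(load);
        try {
            pool.execute(() -> {
                try {
                    job.run();
                } finally {
                    permits.release(load); // give the load back, even on failure
                }
            });
        } catch (RuntimeException e) {
            permits.release(load);         // the pool rejected the job
            throw e;
        }
    }

    /** Load that could still be accepted right now. */
    int availableLoad() {
        return permits.availablePermits();
    }

    boolean shutdownAndWait(long timeout, TimeUnit unit) throws InterruptedException {
        pool.shutdown();
        return pool.awaitTermination(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        SemaphoreLoadLimiter limiter = new SemaphoreLoadLimiter(500, 128);
        for (int i = 0; i < 50; i++) {
            int predictedLoad = 1 + (int) (Math.random() * 10);
            limiter.submit(predictedLoad, () -> {
                try {
                    Thread.sleep(20); // simulate some work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        limiter.shutdownAndWait(1, TimeUnit.MINUTES);
        System.out.println("free load after shutdown: " + limiter.availableLoad());
    }
}
```

This drops the wait/notify plumbing, at the price of a fixed capacity: adjusting the limit at runtime from CPU or I/O indicators would need extra work, which is where the LoadTracker approach is more flexible.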

_pulsar_: instant <3: http://t.co/1FiXB0q - if you need to plot some data or a function - way to go!

plenty of videos and slides straight from hadoop world

plenty of videos and slides straight from hadoop world: http://is.gd/h1YeD #base #hadoop #bigdata

OpenHUG Meeting

_pulsar_: OpenHUG Meeting, we are looking forward to a big turnout! http://linkd.in/drEaBF || http://bit.ly/aERlUU || http://bit.ly/bNGGCt

Why pigz freakin' rock(s)

I quite often find myself needing to copy large text files over the network. Usually one would go with gzip, either transparently via the compression switch of scp (scp -C) or by compressing the file before pushing it over the wire.

Turns out gzip can compress quite well, but it won't saturate your 100mbit line if you do something like this:

cat bigfile.txt | gzip -c | ssh me@other.side 'cat | gunzip -d > bigfile.txt'

This has the same end result as "scp bigfile.txt me@other.side:", but spelling it out as a pipeline makes it easier to understand the alternatives coming up next.

On a pretty decent machine (Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz) I could get about 60% saturation of the network link. So, what can we do? We could lower the compression level to take load off the CPU and shift it towards the network. We could also use an alternative compression algorithm such as LZO. "lzop" is a free implementation available in most common Linux distributions, so this might be the easiest way to go:

cat bigfile.txt | lzop -c | ssh me@other.side 'cat | lzop -cd > bigfile.txt'

My initial tests showed that LZO's compression ratio is about 20% lower than gzip's with default settings. On the other hand, transfer time was almost cut in half. So, how do we get 100% network saturation *AND* great compression? Pigz is a parallel gzip implementation, so instead of maxing out only one thread as gzip does, it will use all the available cores and threads your fancy server provides. Downside? No Debian stable repository packages available yet. But on the other hand: it does not even require a configure script, just one header file and 2 "c" files. Most probably your remote connection to the server will take longer to refresh the console output than the compilation process itself.

So, emerge, apt-get install, port install or whatever "pv" and have some fun like this:

    me@host:/mnt/data1/import$ cat bigfile.txt | pv | pigz -c | ssh me@otherhost 'cat | unpigz > /mnt/data1/bigfile.txt'
    1.83GB 0:00:18 [95.3MB/s] [ <=> ]

Some more numbers: gzip gives me 45 MB/s and lzop 60 MB/s.

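I think "pv" stands for pipe view, it is responsible for the nice stats during the transfer. And yes, this *is* a 100mbit network connection pushing a text file at ~100 MB/s. pv sits in front of pigz in the pipe, so that number is the uncompressed rate; the wire itself only carries the compressed stream, which cannot exceed about 12.5 MB/s on such a link. Nice, isn't it?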

You can find pigz over here
