Archive
Slow disk access and overall speed issues in newer Ubuntu and Linux Mint
I have the subjective feeling that newer versions of Ubuntu and its derivatives, including Linux Mint, have slowed down a bit compared to the pre-Unity Ubuntus (11.04) or Linux Mint 17. I had been looking for the reason for some time. This probably won’t apply to you unless you are using a slow hard disk. The symptom is that disk access latencies feel larger (e.g. applications take longer to start).
Linux has a software component called the I/O scheduler, which takes care of scheduling requests to the permanent storage devices. There are a few choices for the algorithm used for this purpose. The choices and the current selection can be seen by issuing
cat /sys/block/sda/queue/scheduler
(For me, sda is the drive containing all the Linux partitions. It may be different for you, e.g. sda, sdb, sdc... or hda, hdb, hdc...)
The default in modern Ubuntu and Linux Mint is the ‘deadline’ scheduler, probably a decision based on benchmark results on modern hardware. I changed this to ‘cfq’, and subjectively the system became a lot more responsive. You can try out the setting temporarily by writing the new scheduler name to the above file with the command
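The command prints the available schedulers on one line, with the active one in square brackets, for example noop [deadline] cfq (the exact names depend on your kernel). As a rough sketch, a C program could pick out the active scheduler from such a line as shown here. active_scheduler is a hypothetical helper name used only for illustration, and reading the line from the sysfs file is left out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (currently active) scheduler name from a line such as
 * "noop [deadline] cfq" into out.
 * Returns 0 on success, -1 if no complete "[name]" is found or if the name
 * (plus the terminating NUL) does not fit into out.
 * active_scheduler is a hypothetical helper, not part of any library.
 */
int active_scheduler(const char *line, char *out, size_t out_size)
{
    const char *open = strchr(line, '[');
    if (open == NULL)
        return -1;

    const char *close = strchr(open + 1, ']');
    if (close == NULL)
        return -1;

    size_t len = (size_t)(close - (open + 1));
    if (len + 1 > out_size)
        return -1;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}
```

To use it, read the first line of /sys/block/sda/queue/scheduler (for example with fgets) and pass it in along with a small output buffer.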
sudo sh -c 'echo cfq > /sys/block/sda/queue/scheduler'
If you find that system responsiveness is better than before, you can make the change permanent by adding the following setting to /etc/default/grub and running update-grub2
sudo vim /etc/default/grub
Add or edit the setting
GRUB_CMDLINE_LINUX="elevator=cfq"
Save the file and exit the editor, then run
sudo update-grub2
From the next reboot onwards, the setting should take effect at boot.
Area vs Time: A look at algorithmic complexity in Hardware and Software
Many of us learned in college that implementing something in hardware is faster than implementing the same thing in software. What I see in this situation is time complexity being converted into area complexity, together with the benefits of pipelining and the reduced control overhead of a special-purpose control implementation (as opposed to the more generic and bulky control logic required by a general-purpose CPU).
My own recent expeditions in this area suggest something that is not so obvious. The inverse proportion between logic area and time starts only after a discontinuity. Some algorithms benefit substantially if a very minimal part of the algorithm is implemented in hardware. Beyond that point, area and time are inversely proportional.
I would call such a minimal piece of logic, which brings in a huge value add, a ‘primitive function’. Now a little bit of theory. A general-purpose processor performs some data manipulation function (e.g. add, subtract, multiply, divide, compare, shift) between two registers. For a simple 32-bit RISC processor whose instructions take two 32-bit register operands and produce one 32-bit register output, there is some combinational logic between the 64-bit input and the 32-bit output, selected by the current instruction. The typical RISC instruction set has fewer than 256 instructions. Is this enough to cover every possible function mapping from a 64-bit input to a 32-bit output? Certainly not. For example, you can always define a function that mirrors (bit-reverses) one of the 32-bit inputs to the output, which a minimal instruction set may not provide as a single instruction.
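To put a rough number on this (a back-of-the-envelope count, not taken from any reference): each of the 2^64 possible input pairs can be mapped to any of 2^32 outputs, so the number of distinct two-operand functions is

```latex
\[
\#\left\{\, f : \{0,1\}^{64} \to \{0,1\}^{32} \,\right\}
  = \left(2^{32}\right)^{2^{64}}
  = 2^{32 \cdot 2^{64}}
  = 2^{2^{69}}
  \gg 256 .
\]
```

An instruction set with fewer than 256 opcodes can therefore provide only a vanishingly small fraction of these mappings as single instructions.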
So no general-purpose processor can implement every possible function mapping from a two-word input space to a one-word output space. Instead, it implements only the most commonly used mappings such as add, subtract, boolean operators and shifts. Take the case of mirroring the bits of a 32-bit word: done bit by bit with shifts and masks, it takes at least 32 loop iterations on a general-purpose CPU. Implementing the mirroring in hardware, on the other hand, consumes hardly any logic (it is essentially just wiring), yet it instantly reduces the time required to compute the mirror by a large amount.
So if you are tasked with implementing custom hardware constrained by area, the first thing to ask is: does my CPU support all the primitive logic functions relevant to the algorithms in use?
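The software side of this comparison can be sketched in C as a plain bit-by-bit loop. mirror32 is a hypothetical name chosen for this illustration. Each of the 32 iterations costs several instructions (shift, mask, OR, loop control), whereas in hardware the same mapping is a fixed permutation of wires.

```c
#include <stdint.h>

/*
 * Reverse the bit order of a 32-bit word, one bit per iteration:
 * bit 0 of the input ends up in bit 31 of the result, bit 1 in bit 30,
 * and so on. mirror32 is a hypothetical name used only for illustration.
 */
uint32_t mirror32(uint32_t x)
{
    uint32_t result = 0;

    for (int i = 0; i < 32; i++) {
        result = (result << 1) | (x & 1u); /* append the lowest bit of x */
        x >>= 1;                           /* move on to the next bit */
    }
    return result;
}
```

Applying mirror32 twice gives back the original word, which makes for a quick sanity check.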
TODO: To be updated with some rough sketches indicating my idea
Getting FreeRTOS to work with GCC and LPC2129
FreeRTOS is a good OS to start with if you have some fairly capable hardware, like the ARM-based LPC2129. FreeRTOS is designed to work with different processor architectures. The OS has two parts: a set of architecture-independent code, and a set of architecture-specific code called ports. The syntax for non-standard functionality like interrupt handling and inline assembly differs between compilers, and interrupt service routines and inline assembly are typically necessary in the architecture-dependent files. Therefore, a port is specific to a compiler and a target chip. The present version of FreeRTOS doesn’t have a port for the LPC2129 and GCC combination.
I assume that you have a working GCC compiler and the other tools already installed. Refer to ARM Development under Ubuntu 10.04 for how to set up the tools under Linux. Download FreeRTOS from sourceforge.net and unzip it into a convenient location.
The files you will need to compile are
- list.c, queue.c, tasks.c and croutine.c from Source/
- port.c and portISR.c from Source/portable/GCC/ARM7_LPC2000
- heap_2.c from Source/portable/MemMang
- A file containing the main() routine
- boot.s and lpc2106-rom.ld from Demo/ARM7_LPC2106_GCC
- LPC21xx.h from Demo/ARM7_LPC2138_Rowley
Rename lpc2106-rom.ld to LPC2129-ROM.ld, and open it using a text editor. You will see the following at the start of the file.
Change these lines as displayed in the following figure. The sizes of the Flash and RAM are modified to those of the LPC2129.
A simple main file will look like
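#include "FreeRTOS.h"
#include "task.h"

/* Task function, defined below. */
void foobar( void *pvParameters );

int main( void )
{
    /* static, so the task parameters do not live on main()'s stack */
    static unsigned char para0, para1;

    para0 = 0;
    xTaskCreate( foobar, "NAME", configMINIMAL_STACK_SIZE, &para0, tskIDLE_PRIORITY, ( xTaskHandle * ) NULL );
    para1 = 1;
    xTaskCreate( foobar, "NAME", configMINIMAL_STACK_SIZE, &para1, tskIDLE_PRIORITY, ( xTaskHandle * ) NULL );
    vTaskStartScheduler();
    while( 1 ); /* only reached if the scheduler could not start */
}

void foobar( void *pvParameters )
{
    /* Each task receives a pointer to its own flag. */
    unsigned char *flag = ( unsigned char * ) pvParameters;

    while( 1 )
    {
        if( *flag == 0 )
        {
            /* do something */
        }
        else
        {
            /* do something else */
        }
    }
}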
(Thanks to my friends Abhishek and Nisarg for the original main file)
Now you can compile and link all the files specified above, along with libc.a (specified by the linker option -lc) and the linker script for the chip (-TLPC2129-ROM.ld). Also add the Source/include directory to your compiler's include search path.
The generated ELF file can be converted to a hex file using the arm-elf-objcopy command (if you have followed ARM Development under Ubuntu 10.04). The program can then be written to the LPC2129 chip over the serial connection using the lpc21isp command.
e.g. lpc21isp -control -hex trial.hex /dev/ttyS0 9600 12000 will program trial.hex into the chip through serial port ttyS0 at 9600 baud. 12000 is the frequency of the crystal (in kHz) used for the LPC2129 chip.
JACK Audio Connection Kit
Last weekend, I was trying out the JACK Audio Connection Kit on Linux. They say it's not limited to Linux and is or will be available for other UNIX-like OSes, Windows and Mac. I was lazy and busy with some HTML learning, so I could only understand and run the basic program provided with it. This coming weekend I may try it with other programs and give you a better article. The following are a few of the things I noticed.
- JACK is a very simple and intuitive API for routing and processing digital audio in real time.
- JACK is meant for professional audio, and is not well suited to embedded systems with very low processing power. Maybe you can do simple processing with an ARM.
- JACK is similar in concept to analog modular synthesisers, where you connect one or more modules (e.g. sound generators) to one or more other modules (e.g. filters) using patch cords. Here the modules are replaced by software programs (like music players, your own programs, etc.), and the interconnections of each module are specified in the programs.
- You can implement your own real-time audio processing algorithm in C using the JACK API.
- The API is huge and may repel many amateurs. The fact is that with 10 to 12 functions you can build fairly complex audio software.
You can expect more of my experiences with JACK in the weeks that follow.