Sunday, December 30, 2007

Fixed an error in SIFT code

When compiling the SIFT code in MATLAB with sift_compile, I got this error message:


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Trying LAPACK lib 'C:\Program Files\MATLAB\R2007a\extern\lib\win32\microsoft\libmwlapack.lib'
Writing library for siftrefinemx.mexw32
c:\users\mike\appdata\local\temp\mex_a97b690e-eec3-47be-bfb2-13aa33a93822\siftrefinemx.obj
.text: undefined reference to '_dgesv'

C:\PROGRA~1\MATLAB\R2007A\BIN\MEX.PL: Error: Link of 'siftrefinemx.mexw32' failed.

??? Error using ==> mex at 206
Unable to complete successfully.

Error in ==> sift_compile at 56
mex('siftrefinemx.c',opts{:}) ;

~~~~~~~~~~~~~~~~~~~~~~~~~~~

I'm using Visual Studio 7 and MATLAB 7.

The problem can be solved by adding __declspec(dllimport) in front of the function declaration of dgesv:

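/* Import DGESV from MATLAB's LAPACK library instead of expecting a local symbol. */
#ifdef __cplusplus /* the standard predefined macro has no trailing underscores */
extern "C" {
__declspec(dllimport) extern int DGESV(int *n, int *nrhs, double *a, int *lda,
int *ipiv, double *b, int *ldb, int *info) ;
}
#else
__declspec(dllimport) extern int DGESV(int *n, int *nrhs, double *a, int *lda,
int *ipiv, double *b, int *ldb, int *info) ;
#endif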




Saturday, December 29, 2007

MATLAB: Debugging C Language MEX-Files

http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/matlab_external/f32489.html&http://www.google.com/search?client=firefox-a&rls=org.mozilla%3Aen-US%3Aofficial&channel=s&hl=en&q=debug+MATLAB+mex+C&btnG=Google+Search

Simple way to doubleSize or shrink an Image by half

Excerpt from Andrea Vedaldi's code:

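function J = doubleSize(I)
% Upsample I by 2 with linear interpolation.
% The last row and column of J are left at zero.
[M,N]=size(I) ;
J = zeros(2*M,2*N) ;
% original pixels go to the odd positions
J(1:2:end,1:2:end) = I ;
% even row, even column: average of the four diagonal neighbours
J(2:2:end-1,2:2:end-1) = ...
0.25*I(1:end-1,1:end-1) + ...
0.25*I(2:end,1:end-1) + ...
0.25*I(1:end-1,2:end) + ...
0.25*I(2:end,2:end) ;
% even row, odd column: average of the pixels above and below
J(2:2:end-1,1:2:end) = ...
0.5*I(1:end-1,:) + ...
0.5*I(2:end,:) ;
% odd row, even column: average of the pixels left and right
J(1:2:end,2:2:end-1) = ...
0.5*I(:,1:end-1) + ...
0.5*I(:,2:end) ;

function J = halveSize(I)
% Keep every second pixel in each direction (no smoothing).
J=I(1:2:end,1:2:end) ;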

Thursday, December 27, 2007

Worth Reading

Yan Ke, PCA-SIFT: A More Distinctive Representation for Local Image Descriptors.
comment: uses PCA to capture and maximize the distinctiveness of SIFT descriptors; simple but sharp.

Worth Reading

Pedro F. Felzenszwalb, Pictorial Structures for Object Recognition.
comment: introduces a tree-like pictorial structure and an efficient recursive matching algorithm based on dynamic programming.
A part-dependence selection algorithm based on an MST (minimum spanning tree) is also proposed.
But it turns out that building the pictorial structure is not fully automatic, and the dependence selection is not used in the experiments.

Monday, December 24, 2007

Spatial and temporal distribution of Piotr's features


Statistics show that Piotr's features are well distributed in the video volume data.

The whole 3D space is populated with Piotr's features.

I guess I don't have to worry about this quality now.

Friday, December 21, 2007

A clever way to convert a monochrome video to a color one

I is a monochrome (grayscale) video with dimensions [width, height, length];

V = permute(I, [1,2,4,3] );%size(V)=[width,height,1,length];
V = repmat( V, [1,1,3,1] );%size(V)=[width,height,3,length];

V is now a 3-channel video, though every channel is the same.

Thursday, December 20, 2007

Worth Reading

I decided to clean up my closet, so here comes the paper which I read a month ago:

Piotr Dollar, Behavior Recognition via Sparse Spatio-Temporal Features
comment: extends LoG to 3D video sequences, computing derivatives along 2 spatial and 1 temporal dimensions to find salient cubes. His toolbox actually has a set of descriptors that are worth playing with.

Ahmed Elgammal, Inferring 3D Body Pose from Silhouettes using Activity Manifold Learning.
comment: an interesting idea. It finds the manifold on which continuous human motion lies and tries to learn the 'inverse projection' from this lower-dimensional space to the higher-dimensional 3D space.
However, there should be some fundamental problems: how to detect and resolve the self-crossings?

David Beymer, Image Representations for Visual Learning.
comment: a pioneering work that learns the non-linear correspondence between the 3D and 2D spaces of object models and images, which actually gave the idea for Elgammal's paper.

David Heckerman, A Tutorial on Learning With Bayesian Networks
comment: as the topic says, a good tutorial on LEARNING with Bayesian Networks, not INFERRING. For inferring with Bayesian Networks, you have to look at other tutorials (Bishop, 2006, a very good book).

Michael Isard, CONDENSATION-conditional density propagation for visual tracking.
comment: A must-read paper and tutorial for CONDENSATION.

Dick de Ridder, Locally linear embedding for classification.
comment: A quite readable tutorial and tech-report for LLE

Sam T. Roweis, Nonlinear Dimensionality Reduction by Locally Linear Embedding.
comment: original Science paper for LLE

Joshua B. Tenenbaum, A Global Geometric Framework for Nonlinear Dimensionality Reduction.
comment: original Science paper for ISOMAP. published back to back with Sam's LLE paper.

David G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints.
comment: original SIFT paper.

W.R. GILKS, Adaptive Rejection Sampling for Gibbs Sampling
comment: a clever modification of the Gibbs sampling algorithm that bounds the sampling region adaptively to improve efficiency, quite interesting.

Weiming Hu, A Survey on Visual Surveillance of Object Motion and Behaviors.
comment: extensive survey.

Aaron F. Bobick, The Recognition of Human Movement Using Temporal Templates.
comment: original paper for Spatial-Temporal templates.

Haibin Ling, Diffusion Distance for Histogram Comparison.
comment: interesting algorithm to compare histograms using heat diffusion process simulation, which is quite reasonable.

Yossi Rubner, The Earth Mover's Distance as a Metric for Image Retrieval.
comment: EMD for histogram comparison, see Haibin Ling's paper.

Jianbo Shi, Good Features to Track.
comment: very old tech report on Feature Detection.

Shivani Agarwal: Learning to Detect Objects in Images via a Sparse, Part-Based Representation.
comment: uses a bag-of-words strategy to detect objects. Co-occurrence and spatial information are encoded to train a SNoW classifier.

Thursday, December 13, 2007

Worth Reading

David Liu, A Topic-Motion Model for Unsupervised Video Object Discovery.
comment: An extension of paper 'Josef Sivic, Discovering objects and their location in images.' to videos.

Wednesday, December 12, 2007

Worth Reading

Shai Ben-David, A Sober Look at Clustering Stability.
comment: clever definition for the stability of clustering algorithms.

Sunday, December 09, 2007

Worth Reading

Dec. 9 th, 2007
Josef Sivic, Discovering objects and their location in images.
comment: migrates pLSA (probabilistic latent semantic analysis) from the natural language processing domain to the image analysis domain, using a bag-of-words strategy to discover multiple objects automatically in a bunch of images.

Thomas Hofmann, Unsupervised Learning by Probabilistic Latent Semantic Analysis.

Thomas Hofmann, Probabilistic Latent Semantic Indexing.

Lihi Zelnik-Manor, Self-Tuning Spectral Clustering.
comment: It's about spectral clustering, with automatic parameter tuning.

Scott Deerwester, Indexing by Latent Semantic Analysis.
comment: A good tutorial on applying SVD to semantic analysis and factor analysis. It provides a compact and flexible framework for associating different objects, esp. pages 13-14.

Friday, August 17, 2007

Wednesday, April 04, 2007

success rate of 9.999999999==0








I've been training two terriers recently.

The first one is Haar-Cascade, which detects the '0' digits in a digital meter, and the second one, called Bayes, reads them...

They did fairly well. But 100% accuracy is what I really want.




Thursday, January 25, 2007

I'm working on plan B, C, D, E ....



Water meter reading might be the hardest case I've ever encountered.

I tried

Recognition, Principal Axis estimation, Tracking, Single Tone Frequency Estimation, and all combinations of these strategies.

The rate of successful reads is still no more than 95%.

Wednesday, January 10, 2007

counting the periods of a signal

This is a noisy wave, and I want to count the periods of its fundamental component.

The resolution should be around 0.2 T (T is the period of the curve).





The fundamental component can be extracted using DCT or FFT band filtering.
But discrete transforms can only achieve integer resolution in the frequency domain, so the filtered curve loses its period resolution in the time domain.
I tried to fit a curve in the frequency domain to locate the maximum at a 'sub-integer' level, but that only leads to a 0.5 Hz improvement, and it's not a reliable solution.

Must I use a continuous cosine or Fourier transform? Or is there a better idea?

What I mean is that if the length of the signal is N, then in the frequency domain we get only N points: 1/N, 2/N, ..., k/N.
For example (in MATLAB):
x=[-pi:0.01:pi];
y=cos(x);
dc=dct(y);

plot(dc(1:10));
The signal y has one period, and its DCT has a peak exactly at dc(3).

For:
y1=cos(1.5*x);
plot(y1);
the signal has 1.5 periods.
And for:
y2=cos(1.2*x); plot(y2);
the signal has 1.2 periods.

y1 and y2 now have fractional periods. Their DCTs have similar shapes, with positive and negative peaks at dc(3) and dc(5) plus other small peaks; only the amplitudes are different.
So the DCT can only describe this fraction through noisy data on other subwaves (see pictures below).
How do I detect this fraction (0.5 and 0.2) directly from the frequency domain?
dc=dct(y); plot(dc(1:10));
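
One possible way around the integer-bin limit, sketched here in Python rather than MATLAB, is to skip the discrete transform and fit a single sinusoid a*cos(k*x) + b*sin(k*x) by least squares for a fine grid of candidate k values, keeping the k that explains the most signal energy. This is only a sketch under assumptions: the function names, the search range and the 0.01 grid step are hypothetical choices, the signal is assumed to contain one dominant tone, and on the noisy meter data the estimate would only be approximate.

```python
import math

def explained_energy(xs, ys, k):
    """Energy of ys captured by the best least-squares fit a*cos(k*x) + b*sin(k*x)."""
    cc = cs = ss = cy = sy = 0.0
    for x, y in zip(xs, ys):
        c, s = math.cos(k * x), math.sin(k * x)
        cc += c * c
        cs += c * s
        ss += s * s
        cy += c * y
        sy += s * y
    det = cc * ss - cs * cs          # determinant of the 2x2 normal equations
    if abs(det) < 1e-12:
        return 0.0                   # degenerate candidate, ignore it
    a = (cy * ss - sy * cs) / det
    b = (sy * cc - cy * cs) / det
    return a * cy + b * sy

def estimate_periods(xs, ys, k_min=0.5, k_max=3.0, step=0.01):
    """Grid search (hypothetical range and step) for the number of periods in the window."""
    steps = int(round((k_max - k_min) / step))
    candidates = [k_min + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda k: explained_energy(xs, ys, k))

# Same sampling as x=[-pi:0.01:pi] in the MATLAB example (629 samples).
xs = [-math.pi + 0.01 * i for i in range(629)]
y1 = [math.cos(1.5 * x) for x in xs]
y2 = [math.cos(1.2 * x) for x in xs]

k1 = estimate_periods(xs, y1)   # close to 1.5 for this noiseless signal
k2 = estimate_periods(xs, y2)   # close to 1.2 for this noiseless signal
print(k1, k2)
```

The resolution is set by the grid step instead of by the signal length, at the cost of one pass over the samples per candidate frequency.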




VMM: registration of images.

transferred from http://smartnose.blogeasy.com
Originally posted on Mon Sep 18, 2006 7:42 pm

When the user measures targets of the same type, it would be cool if the machine remembered the first target, compared the rest of them to it and automatically found the edges of interest. Image registration is employed to find the correspondence between the first target (the template) and the rest of them. For VMM, the transform is apparently rigid, not even affine. So it should be an easy task, but two types of targets should be considered: (1) targets with much texture information, like printed PCB boards and colored particles; (2) targets with only edge information, such as plugs and small mechanical parts.

I'm now testing two strategies. The first one was used in my MSc thesis to estimate camera motion, i.e. identify some landmark points, match them and then compute the affine transformation; the second strategy is based on Chamfer Matching (Borgefors, 1988). The first strategy proves to be very accurate on PCB board images, but performs poorly on the second type of targets. Its steps are: (1) detect landmarks (something like cvGoodFeaturesToTrack); (2) match the landmarks by neighborhood correlation; (3) compute the affine transformation with iterated outlier removal.

To speed up the computation, I did this on an image pyramid with different image resolutions: first compute the affine model on the coarsest image, then localize step (2) according to that model on a finer image, and so on. The average registration error (defined in much of the literature; it's not convenient to input formulas in a blog) decreases at each level of the pyramid. In the example below, the average error is 60, 53 and 26 for the successive layers of the hierarchical structure.
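
Step (3), fitting a transform while repeatedly throwing away bad matches, can be sketched as follows. This is a minimal Python illustration, not the code used in the project: it fits a rigid transform (rotation plus translation) instead of the affine model mentioned above, because the post notes the motion is rigid and the closed form is short, and the function names, threshold schedule and sample data are all made up for the example.

```python
import math

def fit_rigid(pairs):
    """Least-squares rotation + translation mapping src points onto dst points."""
    n = len(pairs)
    csx = sum(s[0] for s, _ in pairs) / n
    csy = sum(s[1] for s, _ in pairs) / n
    cdx = sum(d[0] for _, d in pairs) / n
    cdy = sum(d[1] for _, d in pairs) / n
    dot = cross = 0.0
    for (sx, sy), (dx, dy) in pairs:
        sx, sy, dx, dy = sx - csx, sy - csy, dx - cdx, dy - cdy
        dot += sx * dx + sy * dy
        cross += sx * dy - sy * dx
    theta = math.atan2(cross, dot)            # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)            # translation aligns the centroids
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty

def residual(model, pair):
    """Distance between the transformed src point and its matched dst point."""
    theta, tx, ty = model
    (sx, sy), (dx, dy) = pair
    c, s = math.cos(theta), math.sin(theta)
    return math.hypot(c * sx - s * sy + tx - dx, s * sx + c * sy + ty - dy)

def register(pairs, thresholds=(30.0, 10.0, 3.0)):
    """Refit after dropping matches whose residual exceeds a shrinking threshold."""
    kept = list(pairs)
    for thr in thresholds:
        model = fit_rigid(kept)
        inliers = [p for p in kept if residual(model, p) <= thr]
        if len(inliers) < 2:
            break                             # too few matches left to refit
        kept = inliers
    return fit_rigid(kept), kept

# Synthetic matches: 30 degree rotation, shift (5, -3), plus one wrong match.
true_theta = math.radians(30)

def move(p):
    c, s = math.cos(true_theta), math.sin(true_theta)
    return (c * p[0] - s * p[1] + 5.0, s * p[0] + c * p[1] - 3.0)

src = [(float(x), float(y)) for x in (0, 10, 20, 30) for y in (0, 10, 20)]
pairs = [(p, move(p)) for p in src]
pairs.append(((15.0, 5.0), (100.0, 100.0)))   # deliberately bad correspondence

model, kept = register(pairs)
print(math.degrees(model[0]), model[1], model[2], len(kept))
```

On a pyramid, the model found on the coarse level would seed the matching on the next finer level; that part is left out of the sketch.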

BTW: some papers prefer optical-flow-style methods, but these ideas are unsuitable for VMM: the fundamental hypothesis that the image sequence is 'continuous' doesn't hold here. When a user puts targets on the platform, a minimal offset causes the image to shift a lot, so the source and target images won't be spatially close at all. Optical flow computation would need a very large search window here, which is undesirable in both time and accuracy.

fig. First layer of the cascade











fig. Second layer of the cascade

Visual (or Vision?) Measuring Machine: Recognition Refined By RANSAC

Transferred from http://smartnose.blogeasy.com

originally posted on Mon Sep 18, 2006 7:06 pm

Things become ugly when they are magnified a lot: curves can be zig-zag and straight lines are not straight anymore. Hough algorithms won't solve all problems. They are good at identifying parametric shapes, even when the shapes are broken or incomplete. But when the shape itself cannot be approximated by parametric equations, Hough algorithms find many scattered local maxima in the parameter space. So when the images become ugly, we have to fit something instead of detecting a nice shape in the image. To cope with the noise and to prune adjoining edges from the one we want, an iterated fitting strategy should be used that removes the points with large residuals under decreasing thresholds. The steps are: (1) track the edges; (2) fit them to a line/circle/ellipse; (3) prune the points that have a larger residual than the current threshold; (4) decrease the threshold and go to (2).
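
Steps (2) to (4) can be sketched for the line case as below. This is an illustrative Python sketch under assumptions, not the project code: it uses an ordinary least-squares fit of y = m*x + c with vertical residuals (so it cannot handle vertical edges), and the threshold schedule and the sample points are invented for the example. Edge tracking, step (1), is assumed to have produced the point list already.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = m*x + c (vertical residuals)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    m = sxy / sxx
    return m, my - m * mx

def fit_with_pruning(points, thresholds=(20.0, 10.0, 5.0, 1.0)):
    """Fit, drop points whose residual exceeds the threshold, tighten, refit."""
    kept = list(points)
    for thr in thresholds:
        m, c = fit_line(kept)
        survivors = [(x, y) for x, y in kept if abs(y - (m * x + c)) <= thr]
        if len(survivors) < 2:
            break                    # not enough points left for a line
        kept = survivors
    return fit_line(kept), kept

# Tracked edge points on y = 2x + 1, plus three points from an adjoining edge.
edge = [(float(i), 2.0 * i + 1.0) for i in range(20)]
edge += [(5.0, 40.0), (10.0, -30.0), (15.0, 60.0)]

(m, c), kept = fit_with_pruning(edge)
print(m, c, len(kept))
```

For circles and ellipses the same loop applies with a different fitting function and a geometric distance as the residual.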

Visual Measuring Machine: Geometry Recognition

transferred from http://smartnose.blogeasy.com, posted on Tue Jul 18, 2006 12:02 am

I'm now working on a Visual Measuring Machine. It's something like an automatic microscope. With a powerful optical system, it can magnify the image many times. After snapping a picture, the user points out which edge he/she wants to measure; the software then needs to recognize simple geometries such as lines and circles in the picture and locate their end points. After obtaining these measurements in pixels, the real length can be computed using the scale factor. The user can, of course, manually mark the line on the picture, but that would be a boring task. The targets being measured are usually small and are manufactured according to CAD scripts, so a vectorization of the picture is often desirable and feasible.

The development has 2 stages:
1. Recognize lines and circles, to aid the user with recognition power.
2. Implement a full-scale vectorization feature and use CAD data to check the vectorization result automatically. This would be a very useful feature: after a product is manufactured, every edge and surface is reconstructed by the system and compared with the original design to find defects.

The first stage is almost done. It's relatively easy. Anyway, here are the pictures.
















My thesis: Multi-target counting based on image sequence fusion

Transferred from http://smartnose.blogeasy.com. posted on Thu Nov 3, 2005 1:56 am





Project: Recognize this?






Originally posted on Tue May 10, 2005 1:39 am
























Project: online steel bar counter

These are steel bars running on the production line at a speed of 5 m/s. We use a CCD camera to recognize and track them. When the number of steel bars that have passed reaches the predefined count, we stop the production line and signal the workers to wrap them. Our algorithm runs in real time (processing 25 frames per second), but it has one big problem: it might lose track of some steel bars when they vibrate sharply. Although the chance is very low (once per day), it is still annoying. We are working on this problem. After the counting is done, we will work out a plan to wrap the steel bars with a robot hand. This requires a very good tracking algorithm and control strategy. We have to simulate a human hand to do this job.

Dating with Microsoft Research Asia!!!!!!!!!!!!!!!!


Transferred from http://smartnose.blogeasy.com. Originally posted on Tue Dec 6, 2005 12:33 am


After a ferocious screening test, the Heracles of the Software Industry, Microsoft, offered me a free 3-day round trip to Beijing to visit Microsoft Research Asia. It will be the first time for me to take a flight. ^__^ And I'm gonna meet so many geeks there!

Project: offline steel bar counter

Transferred from http://smartnose.blogeasy.com, Fri May 6, 2005 8:26 pm

For an iron & steel company, it's sometimes necessary to count the steel bars before wrapping them up. Say the clients want 100 pieces a bunch; then you gotta wrap them properly. If you wrap 102 a bunch, you pay for 2 pieces, while if you wrap 98 a bunch, you lose your job permanently. Before, these jobs were done manually by several people, each one taking a pail of paint and counting the bars one by one, using a brush to mark them. We make things simple. With a hand-held computer and a web camera, you snap a picture of the bars and draw a blue line to circle the bars you wanna count, and our system counts the steel bars automatically. The project is closed and the system works perfectly at the Lian Yuan iron & steel company.


This is a snapshot of our software. I added a small zoom window on the right, so users can zoom in and out to check whether the counting result is right.




















Tuesday, January 09, 2007

Project: ceramic tile quality checking system


Transferred from http://smartnose.blogeasy.com. Posted on Tue Apr 19, 2005 9:03 am

Above is a ceramic tile with one of its borders broken. Below is the broken edge that the system detects. The system can also detect spots on the ceramic tile and check the quality of the tile automatically.

My work table