Sunday, November 25, 2007

Top 10 Words in different languages!

Spanish: de, que, no, a, la, el, es, y, en, lo

French: de, la, le, et, les, des, en, un, du, une

German: der, die, und, in, den, von, zu, das, mit, sich

Italian: non, de, che, e, e', la, un, il, a, per

Dutch: de, en, het / 't, van, ik, te, dat, die, in, een

Sunday, September 9, 2007

Algorithm for Writing Better.

Below is a high-level, step-by-step guide (an algorithm) for writing
better in a thesis or paper.

Write, Correct and Repeat:
1. Write a first draft by organizing your thoughts.
2. Make static and dynamic level corrections.
3. Repeat the above two steps until the thesis/paper has a flow.
The flow should be as smooth as in a Harry Potter novel or a good movie!

Static (Sentence) Level Corrections:

Write simple sentences.
By writing long, complex sentences you might hope to convey the
impression that you are nerdy and brainy. However, people hate to read such
sentences. So the important point to remember is "write simple and
short sentences".

Remedy: If a sentence is complex and long, break it into 2-3 simple
sentences.

A simple sentence should also be meaningful.
That means the reader should not be forced to refer to previous sentences
to understand the current sentence. A typical offender is a sentence
with pronouns such as "it", "they" or "theirs", where the reader has to go
back and work out what "it" or "they" refers to. When you read a sentence in
isolation, it should hold some meaning and provide some information
independent of other sentences.

Note that every sentence should be one of the following:
- A logical argument
- A valid truth, which could have been obtained from other researchers' results
- An observation or conclusion from your own experiments

Delete is the greatest mantra for better writing:
If a simple sentence doesn't hold any meaning in isolation, or if it is not a
logical argument / valid truth / observation, then delete it.

If a sentence has passed the litmus tests listed above, then perform
a routine check for English grammar and spelling errors.

Grammar Errors:
In each sentence, grammatical errors have to be identified and corrected.
Friends and colleagues can help in this part.

Spelling Errors:
Spell-checking is essential. Spelling errors in a thesis/paper often
give an extremely bad impression of the author.

Dynamic Level Corrections:
Sentence Level:
There should be continuity (reading flow) between consecutive sentences
in a paragraph. The reader shouldn't feel disconnected when moving from
one sentence to the next.

Paragraph Level:
No two successive paragraphs should seem disconnected in a section.

Section Level:
Every section is a larger logical block of your thesis/paper,
and hence all sections should be logically organized and interleaved
(threaded) in a nice fashion.

Chapter Level:
There should also be some link between two chapters. Often these
links may be a little weak. However, the reader should logically expect
the next chapter by the time he/she reads the last section in the
current chapter.

Chapter and Section Title:
The titles of sections and chapters carry the most important information.
By reading the titles of chapters and sections, the reader should get
an idea of your main point in the thesis or paper. Vague titles such
as "Literature Review", "Introduction" and "Application on Continuous
Speech" don't convey much information to the reader.
Rather, titles should be specific, such as "Literature Review on
Speech Recognition", "Introduction to Speech Synthesis" and
"Application of Pronunciation Modeling on Continuous Speech".

How to Start:
1. Pen down your thoughts in terms of chapters and sections and
arrange them as a table of contents (TOC) for your thesis or paper.
That is why I ask for a TOC as an overview of a thesis/paper.
The TOC should be detailed down to the section and subsection level.

2. Given the TOC, write 3-5 lines in each section and subsection.
Identify and write the main points you want that section and
subsection to cover.

3. Having gone through the sections and subsections in all
chapters, revisit the first section of the first chapter. Expand the main
points of each section into 20-40 simple sentences.

4. Now start the "Write, Correct and Repeat" process.

Sunday, August 26, 2007

Memory Allocation for 2D arrays in C++

// Allocate an array of dimension1 row pointers.
Type** twodname;
twodname = new Type*[dimension1];

// Allocate dimension2 elements for each row.
for (int d1 = 0; d1 < dimension1; d1++) {
    twodname[d1] = new Type[dimension2];
}

Releasing the array:

// Free each row first, then the array of row pointers.
for (int d1 = 0; d1 < dimension1; d1++) {
    delete [] twodname[d1];
}
delete [] twodname;
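
The manual new/delete pairing above can also be avoided altogether. The following is only a sketch of one alternative, not the method this post prescribes: the class name Grid2D and its member names are made up for illustration, and it assumes a single contiguous std::vector is an acceptable layout for the table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper (not from the post): a rows x cols table stored in one
// contiguous std::vector, so the memory is released automatically.
template <typename Type>
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols, const Type& init = Type())
        : rows_(rows), cols_(cols), cells_(rows * cols, init) {}

    // Element access by (row, column); indices are checked in debug builds.
    Type& at(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return cells_[r * cols_ + c];
    }
    const Type& at(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return cells_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<Type> cells_;  // row-major storage: row r starts at r * cols_
};
```

With this sketch, Grid2D<int> table(3, 4) followed by table.at(1, 2) = 42 replaces both the allocation loop and the release loop; the vector frees its storage when table goes out of scope.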

Why Object Oriented Approach?

A major motivation for the Object-Oriented approach was to remove some of the flaws encountered with the procedural approach. OOP treats data as a critical element and does not allow it to flow freely around the program. It binds data closely to the functions that operate on it and protects it from accidental modification by outside functions. OOP allows a problem to be decomposed into a number of entities called objects, and then builds data and functions around these objects. A major advantage of OOP is code reusability.

Some important features of Object Oriented programming are as follows:

  • Emphasis on data rather than procedure
  • Programs are divided into Objects
  • Data is hidden and cannot be accessed by external functions
  • Objects can communicate with each other through functions
  • New data and functions can be easily added whenever necessary
  • Follows bottom-up approach

Concepts of OOP:

  • Objects
  • Classes
  • Data Abstraction and Encapsulation
  • Inheritance
  • Polymorphism

Briefly on Concepts:

Objects

Objects are the basic run-time entities in an object-oriented system. A programming problem is analyzed in terms of objects and the nature of the communication between them. When a program is executed, objects interact with each other by sending messages. Different objects can also interact with each other without knowing the details of each other's data or code.

Classes

A class defines the data and functions shared by a collection of objects of similar type. Once a class is defined, any number of objects belonging to that class can be created.

Data Abstraction and Encapsulation

Abstraction refers to the act of representing essential features without including the background details or explanations. Classes use the concept of abstraction and are defined as a list of abstract attributes.

Storing data and functions in a single unit (class) is encapsulation. The data is not accessible to the outside world; only the functions stored in the class can access it.
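
As a rough illustration of encapsulation, here is a minimal sketch. The Account class, its member names, and the choice to store money as whole cents are all assumptions made up for this example.

```cpp
#include <cassert>

// Hypothetical class for illustration: the balance is private, so only the
// member functions below can read or change it.
class Account {
public:
    explicit Account(long opening_cents) : balance_cents_(opening_cents) {
        assert(opening_cents >= 0);
    }

    // Adds money; non-positive amounts are rejected.
    bool deposit(long cents) {
        if (cents <= 0) return false;
        balance_cents_ += cents;
        return true;
    }

    // Removes money only if the balance covers it.
    bool withdraw(long cents) {
        if (cents <= 0 || cents > balance_cents_) return false;
        balance_cents_ -= cents;
        return true;
    }

    long balance_cents() const { return balance_cents_; }

private:
    long balance_cents_;  // hidden from code outside the class
};
```

Outside code cannot write to balance_cents_ directly (that would be a compile error), so every change has to pass through the checks in deposit and withdraw.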

Inheritance

Inheritance is the process by which objects of one class acquire the properties of objects of another class. In OOP, inheritance provides reusability, for example by adding features to an existing class without modifying it. This is achieved by deriving a new class from the existing one. The new class will have the combined features of both classes.
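
Deriving a new class from an existing one might look like the sketch below. Logger and CountingLogger are hypothetical names chosen for this example; the point is only that the existing class stays unmodified while the derived class gains a feature.

```cpp
#include <string>

// Existing class (hypothetical): left unmodified.
class Logger {
public:
    // Returns the message tagged with its severity.
    std::string format(const std::string& message) const {
        return "[info] " + message;
    }
};

// Derived class: reuses Logger::format and adds a message counter.
class CountingLogger : public Logger {
public:
    std::string log(const std::string& message) {
        ++count_;
        return format(message);  // inherited from Logger
    }

    int count() const { return count_; }

private:
    int count_ = 0;
};
```

A CountingLogger object offers both format (from the existing class) and log/count (from the new class), which is the "combined features" idea in miniature.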

Polymorphism

Polymorphism means the ability to take more than one form. An operation may exhibit different behaviors in different instances. The behavior depends on the data types used in the operation. Polymorphism is extensively used in implementing Inheritance.
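
Both flavours of polymorphism can be sketched in a few lines of C++. All names here (combine, Shape, Rectangle, Square, total_area) are assumptions for illustration: the overloaded combine shows behavior depending on the data types used, and the virtual area() shows how polymorphism works together with inheritance.

```cpp
#include <memory>
#include <string>
#include <vector>

// Compile-time polymorphism: the same operation name behaves differently
// depending on the argument types.
int combine(int a, int b) { return a + b; }
std::string combine(const std::string& a, const std::string& b) { return a + b; }

// Run-time polymorphism: derived classes supply their own area().
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

class Rectangle : public Shape {
public:
    Rectangle(double width, double height) : width_(width), height_(height) {}
    double area() const override { return width_ * height_; }

private:
    double width_;
    double height_;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }

private:
    double side_;
};

// Calls the right area() for each object without knowing its concrete type.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double sum = 0.0;
    for (const auto& shape : shapes) {
        sum += shape->area();
    }
    return sum;
}
```

total_area only knows about Shape, so new shapes can be added later without touching it.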

Advantages of OOP

Object-Oriented Programming has the following advantages over conventional approaches:

  • OOP provides a clear modular structure for programs which makes it good for defining abstract datatypes where implementation details are hidden and the unit has a clearly defined interface.
  • OOP makes it easy to maintain and modify existing code as new objects can be created with small differences to existing ones.
  • OOP provides a good framework for code libraries where supplied software components can be easily adapted and modified by the programmer. This is particularly useful for developing graphical user interfaces.

Resources:
http://en.wikipedia.org/wiki/Object-oriented_programming
http://www.startvbdotnet.com/oop/default.aspx

Saturday, August 11, 2007

What are the prospects of speech and language?

What are the future and prospects of speech and language?

Real-time speech translation (which includes speech recognition, machine translation and text-to-speech systems) is considered to be one of the 5 innovations that will have a major impact and change lives in the next 5 years.

Here are IBM's views on the top 5 innovations impacting the near future:
http://www.ibm.com/ibm/ideasfromibm/us/five_in_five/010807/index.shtml

Real-time speech translation -- once a vision only in sci-fi -- will become the norm: The movement towards globalisation needs to take into account basic human elements, such as differences in language.

For example, IBM speech innovations are already allowing media companies to monitor Chinese and Arabic news broadcasts over the Web in English, travellers using PDAs to translate menus in Japanese, and doctors to communicate with patients in Spanish.

Real-time translation technologies and services will be embedded into mobile phones, hand-held devices and cars. These services will pervade every part of business and society, eliminating the language barrier in the global economy and social interaction.

What is Speech and Language?

What is speech and language processing?

Arthur C. Clarke, one of the best-known science fiction authors, envisioned an artificial agent called HAL. The HAL 9000 computer in Stanley Kubrick's film 2001: A Space Odyssey is one of the most recognizable characters in twentieth-century cinema. HAL is an artificial agent capable of such advanced language-processing behavior as speaking and understanding English, and at a crucial moment in the plot, even reading lips.

What would it take to create at least the language-related parts of HAL? Minimally, such an agent would have to be capable of interacting with humans via language, which includes understanding humans via speech recognition and natural language understanding (and, of course, lip-reading), and of communicating with humans via natural language generation and speech synthesis.

HAL would also need to be able to do information retrieval (finding out where needed textual resources reside), information extraction (extracting pertinent facts from those textual resources), and inference (drawing conclusions based on known facts). Although these problems are far from completely solved, much of the language-related technology that HAL needs is currently being developed, with some of it already available commercially.

Solving these problems, and others like them, is the main concern of the fields collectively known as Speech and Language Processing.

Ref: http://www.cs.colorado.edu/%7Emartin/SLP/slp-ch1.pdf

Saturday, May 19, 2007

Questions to be asked by researchers and Entrepreneurs

Following the wise words in http://seanwise.typepad.com/ the key questions for an entrepreneur include:

  1. Why does anyone (end-user/customer) need this?
  2. Why you? Why do you have a competitive advantage in bringing this product to market?
  3. Why can't others just copy you?
  4. Why is it novel, different, better, than anything else?
  5. How much cash do you want? What are you going to do with it? And what will the results be?
  6. What's it worth? And why is it worth that much?

Prof. Raj Reddy's questionnaire for a research proposal/problem is:
  • What are we trying to do?
  • What is the key make-a-difference capability we bring to the table?
  • How is it being done today? How are others approaching the problem?
  • What are the limitations of current/proposed approaches?
  • What is new in the approach?
  • What is the plan for realizing the goal?
  • If you succeed, who will care? What is the impact?
  • How much will it cost and how long will it take?
  • What are the mid-term and final exams?

The use of "why" questions for entrepreneurs and of "what" and "how" questions for researchers is interesting. Also, the final questions for a researcher seem to be the opening questions for an entrepreneur.

Saturday, March 17, 2007

Google further into visualization

Google has acquired the visual statistics software Trendalyzer.

Check out the interesting demo at:

http://video.google.com/videoplay?docid=7996617766640098677

Friday, March 16, 2007

Ah! Local Languages in UTF-8 format

Yahoo! and MSN have started publishing news content in Indian languages, while Google is summarizing news for Hindi (all of them in UTF-8!!)

I can't explain the joy of reading MSN Telugu. It is so good to read news in your native language, especially international and information technology news items, which are rarely covered by local papers.

The other big advantage is that this data can be used to build language models, which in turn help to build better speech synthesis, speech recognition and machine translation systems for Indian languages.

MSN:

www.msn.co.in/hindi
www.msn.co.in/tamil
www.msn.co.in/telugu
www.msn.co.in/kannada
www.msn.co.in/malayalam

Google:

http://news.google.com/news?ned=hi_in

Yahoo!

http://in.telugu.yahoo.com/
http://in.hindi.yahoo.com/
http://in.gujarati.yahoo.com/
http://in.punjabi.yahoo.com/
http://in.kannada.yahoo.com/
http://in.malayalam.yahoo.com/
http://in.tamil.yahoo.com/

BBC:
http://www.bbc.co.uk/hindi/
http://www.bbc.co.uk/tamil/
http://www.bbc.co.uk/bengali/
http://www.bbc.co.uk/urdu/

Sunday, March 11, 2007

Will IITs remain centers for excellence?

Bhamy V Shenoy, an alumnus of IIT Madras, has an interesting viewpoint on how coaching centers affect the IITs' standing as centers for excellence. Read more at http://in.news.yahoo.com/070311/43/6d55o.html