Monday, November 22, 2010

java stuff...

java blocking queue...
producer and consumer..
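A minimal producer/consumer sketch using a BlockingQueue — the class name, the queue size, and the -1 poison-pill sentinel are my own choices for illustration. put() and take() block when the queue is full/empty, so no explicit wait/notify is needed:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// One producer pushes numbers into a bounded queue; one consumer drains
// and sums them, stopping when it sees the poison-pill sentinel.
public class ProducerConsumerDemo {
    static final int POISON = -1; // sentinel that tells the consumer to stop

    public static int run() {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);
        final int[] sum = {0};

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 10; i++) queue.put(i); // blocks when full
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                int item;
                while ((item = queue.take()) != POISON) sum[0] += item; // blocks when empty
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum[0]; // 1 + 2 + ... + 10 = 55
    }

    public static void main(String[] args) {
        System.out.println("sum = " + run()); // prints "sum = 55"
    }
}
```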

java threads...java concurrent
java.util.concurrent.atomic
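A quick java.util.concurrent.atomic sketch: several threads bumping a shared AtomicInteger (the thread and iteration counts are arbitrary). incrementAndGet() is an atomic read-modify-write, so no updates are lost, whereas a plain int++ across threads could drop increments:

```java
import java.util.concurrent.atomic.AtomicInteger;

// 4 threads each increment a shared AtomicInteger 1000 times;
// the final count is exactly 4000.
public class AtomicCounterDemo {
    public static int run() {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) counter.incrementAndGet();
            });
            threads[t].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "4000"
    }
}
```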

java cubbyhole

java collections
java pattern matching
java map
hash map

java hotspot

scala programming

Wednesday, November 17, 2010

Monday, November 15, 2010

nlp project

Polarity Detection of movie reviews
Introduction :-
Today, very large amounts of information are available in online documents. As part of the effort to better organize this information for users, researchers have been actively investigating the problem of automatic text categorization. In polarity detection of movie reviews, the basic problem is to classify a movie review as positive or negative. A training data set is provided, and we have to develop the best technique to classify the reviews into their respective classes. This would be very useful for discovering the defects in a product: if a customer has decided to purchase a product and wants to know only its defects, rather than the parts that praise it, this classification process would be very helpful. Similarly, there are many other scenarios, like ad generation, that would depend on polarity detection.

Proposed approach :-
Data set :- The data set used for this project is 1400 movie reviews classified by
their polarity.
Preprocessing :- The given data may contain some irrelevant and noisy content, which can be eliminated by applying some techniques. Applying these techniques would increase the accuracy.
Part-of-speech tagging: Tag each word in the data set with its part of speech, using a standard tag set such as the Penn Treebank or some other scheme. From the different combinations of part-of-speech tags, we keep those most related to describing sentiment for our further operations; from general observation, adjectives are good features for polarity detection.
Stop words: Also known as noise words, these are words of little significance for categorization and sentiment analysis, e.g. articles, conjunctions, etc. We need to remove all the stop words from our data set and produce a new data set.

Stemming: Stemming is the process of reducing a word to its root or base form by removing inflectional endings from English words (e.g. the Porter stemmer algorithm). There are also a few other preprocessing techniques, like quote weighting, HTML-tag filtering, and non-English review filtering, which can be applied depending on the data. Previous experimental evaluations found little difference between using the frequency of a feature and using just its presence for polarity detection. So, while I have not yet decided, I am thinking of considering only the presence of a particular feature.
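A toy version of the stop-word and stemming steps. The stop list is hand-picked and the "stemmer" is a one-rule suffix stripper of my own; the real Porter algorithm has several ordered rule phases, so this only illustrates the idea:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Toy preprocessing pipeline: lowercase, drop stop words, then apply a
// crude suffix-stripping "stemmer" (a stand-in for Porter's algorithm).
public class PreprocessDemo {
    static final Set<String> STOP_WORDS =
            Set.of("a", "an", "the", "and", "or", "of", "is", "was");

    // Strip one common inflectional ending; length checks avoid mangling short words.
    static String stem(String w) {
        if (w.endsWith("ing") && w.length() > 5) return w.substring(0, w.length() - 3);
        if (w.endsWith("ed") && w.length() > 4) return w.substring(0, w.length() - 2);
        if (w.endsWith("s") && w.length() > 3) return w.substring(0, w.length() - 1);
        return w;
    }

    public static List<String> preprocess(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty() && !STOP_WORDS.contains(w))
                .map(PreprocessDemo::stem)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "the"/"was" are dropped, "acting" -> "act", "amazing" -> "amaz"
        System.out.println(preprocess("The acting was amazing"));
    }
}
```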

I am thinking of using the following lexical features to decide the polarity:
Combination of a word and its part-of-speech tag
Bigram words

Unigram words

Words before an adjective, treated as collocations, because some words carry a negation in front (e.g. "agree" vs. "not agree")

part-of-speech tag
I am still working on some other lexical features and will report them as the project progresses.
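The unigram and bigram features above could be extracted roughly like this, presence-only per the frequency-vs-presence note; the class name and the "_" feature separator are my own choices. Note how the bigram "not_agree" captures the negation that the unigrams alone would miss:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Extracts presence-based unigram and bigram features from a tokenized
// review. A set (rather than a multiset) is used because only the
// presence of a feature matters here, not its frequency.
public class FeatureExtractorDemo {
    public static Set<String> features(List<String> tokens) {
        Set<String> feats = new LinkedHashSet<>(tokens);          // unigrams
        for (int i = 0; i + 1 < tokens.size(); i++)               // bigrams
            feats.add(tokens.get(i) + "_" + tokens.get(i + 1));
        return feats;
    }

    public static void main(String[] args) {
        // prints "[not, agree, not_agree]"
        System.out.println(features(List.of("not", "agree")));
    }
}
```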

Naive bayes :-
One approach to text classification is to assign to a given document d the class c* = argmax_c P(c | d). We derive the Naive Bayes (NB) classifier by first observing that, by Bayes' rule,

P(c | d) = P(c) P(d | c) / P(d)

where P(d) plays no role in selecting c*. To estimate the term P(d | c), Naive Bayes decomposes it by assuming the features f_i are conditionally independent given the class:

P(d | c) = ∏_i P(f_i | c)
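A minimal sketch of this decision rule in Java, scoring in log space to avoid underflow and using add-one (Laplace) smoothing so unseen features do not zero the product. The class labels, the tiny training examples, and the smoothing choice are my own illustration, not part of the project yet:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Naive Bayes: score(c) = log P(c) + sum_i log P(f_i | c),
// with P(f | c) estimated by add-one-smoothed counts.
public class NaiveBayesDemo {
    private final Map<String, Integer> classDocs = new HashMap<>();              // docs per class
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();    // feature counts per class
    private final Map<String, Integer> classTotal = new HashMap<>();             // total features per class
    private final Set<String> vocab = new HashSet<>();
    private int totalDocs = 0;

    public void train(List<String> feats, String c) {
        classDocs.merge(c, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> m = counts.computeIfAbsent(c, k -> new HashMap<>());
        for (String f : feats) {
            m.merge(f, 1, Integer::sum);
            classTotal.merge(c, 1, Integer::sum);
            vocab.add(f);
        }
    }

    public String classify(List<String> feats) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String c : classDocs.keySet()) {
            double score = Math.log(classDocs.get(c) / (double) totalDocs); // log P(c)
            for (String f : feats) {
                int n = counts.get(c).getOrDefault(f, 0);
                // add-one smoothing: (count + 1) / (class total + |vocab|)
                score += Math.log((n + 1.0) / (classTotal.get(c) + vocab.size()));
            }
            if (score > bestScore) { bestScore = score; best = c; }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesDemo nb = new NaiveBayesDemo();
        nb.train(List.of("great", "acting", "loved"), "pos");
        nb.train(List.of("boring", "awful", "plot"), "neg");
        System.out.println(nb.classify(List.of("loved", "acting"))); // prints "pos"
    }
}
```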

pl project...

server sockets in java ---> ssl sockets....
file handling
threads
runtime java

cloud apps

live mesh
drop box

Saturday, November 13, 2010

learn programming by taking on challenges:

ruby quiz

project euler

pragmatic programmer --> book

duby-talk mailing list

perlmonks.

hadoop file system..

hadoop is a platform for performing distributed computing...


Hadoop is currently aimed at “big data” problems (say, processing Census Bureau data). The nice thing about it is that a Hadoop cluster scales out easily, and there are a number of providers who will let you add and remove instances from a Hadoop cluster as your needs change to save you money. It is the kind of system that lends itself perfectly to cloud computing, although you could definitely have a Hadoop cluster in-house.

While the focus is on number crunching, I think that Hadoop can easily be used in any situation where a massively parallelized architecture is needed.

Wednesday, November 10, 2010

error in creating session..

org.hibernate.HibernateException: Could not parse configuration: /hibernate.cfg.xml
at org.hibernate.cfg.Configuration.doConfigure(Configuration.java:1494)
at org.hibernate.cfg.Configuration.configure(Configuration.java:1428)
at org.hibernate.cfg.Configuration.configure(Configuration.java:1414)
at com.mycompany.app.db.HibernateSessionFactory.&lt;clinit&gt;(HibernateSessionFactory.java:30)
at com.mycompany.entity.Programs.getProgramsWaitingForSubmission2(Programs.java:33)
at com.mycompany.app.job.JobManager.runJobSubmissions(JobManager.java:744)
at com.mycompany.app.clients.JobClient2.runJobs(JobClient2.java:102)
at com.mycompany.app.clients.JobClient2.access$000(JobClient2.java:18)
at com.mycompany.app.clients.JobClient2$4.run(JobClient2.java:87)
Caused by: org.dom4j.DocumentException: Error on line 2 of document : The processing instruction target matching "[xX][mM][lL]" is not allowed. Nested exception: The processing instruction target matching "[xX][mM][lL]" is not allowed.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.hibernate.cfg.Configuration.doConfigure(Configuration.java:1484)
... 8 more
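Note to self: this dom4j error ("processing instruction target matching [xX][mM][lL] is not allowed" on line 2) usually means something, such as a blank line, a BOM, or a comment, comes before the XML declaration, which must be the very first bytes of the file. A hibernate.cfg.xml (Hibernate 3) should start like this; the session-factory contents below are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
        "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
        "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
    <session-factory>
        <!-- connection and mapping settings go here -->
    </session-factory>
</hibernate-configuration>
```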
%%%% Error Creating SessionFactory %%%%
org.hibernate.HibernateException: /var/www/site31/swarmapp/src/main/resources/hibernate.cfg.xml not found
at org.hibernate.util.ConfigHelper.getResourceAsStream(ConfigHelper.java:147)
at org.hibernate.cfg.Configuration.getConfigurationInputStream(Configuration.java:1405)
at org.hibernate.cfg.Configuration.configure(Configuration.java:1427)
at com.mycompany.app.db.HibernateSessionFactory.rebuildSessionFactory(HibernateSessionFactory.java:94)
at com.mycompany.app.db.HibernateSessionFactory.getSession(HibernateSessionFactory.java:72)
at com.mycompany.entity.Programs.getProgramsWaitingForSubmission2(Programs.java:33)
at com.mycompany.app.job.JobManager.runJobSubmissions(JobManager.java:744)
at com.mycompany.app.clients.JobClient2.runJobs(JobClient2.java:102)
at com.mycompany.app.clients.JobClient2.access$000(JobClient2.java:18)
at com.mycompany.app.clients.JobClient2$4.run(JobClient2.java:87)

no path for sending files into teragrid..

condorjobmultisites.java --->swarmapp/src/main/java/com/myco/app/job

there is no path for either input or output files....

left out with one dependency..

installed all the java dependencies..

left out with the swarm jar file

3) Swarm:Swarm:jar:0.9

Try downloading the file manually from the project website.

Then, install it using the command:
mvn install:install-file -DgroupId=Swarm -DartifactId=Swarm -Dversion=0.9 -Dpackaging=jar -Dfile=/path/to/file

Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=Swarm -DartifactId=Swarm -Dversion=0.9 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

Path to dependency:
1) swarmapp:swarmapp:jar:0.0.1-SNAPSHOT
2) Swarm:Swarm:jar:0.9

Tuesday, November 9, 2010

got stuck installing biojava..

i installed it using the Readme file for the swarmapp, but actually it needs to be installed using maven...

at last, the following command worked

mvn install:install-file -DgroupId=biojava -DartifactId=biojava -Dversion=1.6.1 -Dpackaging=jar -Dfile=/path/to/file

Monday, November 8, 2010

work it out !!!!!!

Missing:
----------
1) biojava:biojava:jar:1.6.1

Try downloading the file manually from the project website.

Then, install it using the command:
mvn install:install-file -DgroupId=biojava -DartifactId=biojava -Dversion=1.6.1 -Dpackaging=jar -Dfile=/path/to/file

Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=biojava -DartifactId=biojava -Dversion=1.6.1 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

Path to dependency:
1) swarmapp:swarmapp:jar:0.0.1-SNAPSHOT
2) biojava:biojava:jar:1.6.1

2) javax.sql:jdbc-stdext:jar:2.0

Try downloading the file manually from:
http://java.sun.com/products/jdbc/download.html

Then, install it using the command:
mvn install:install-file -DgroupId=javax.sql -DartifactId=jdbc-stdext -Dversion=2.0 -Dpackaging=jar -Dfile=/path/to/file

Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=javax.sql -DartifactId=jdbc-stdext -Dversion=2.0 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

Path to dependency:
1) swarmapp:swarmapp:jar:0.0.1-SNAPSHOT
2) javax.sql:jdbc-stdext:jar:2.0

3) Swarm:Swarm:jar:0.9

Try downloading the file manually from the project website.

Then, install it using the command:
mvn install:install-file -DgroupId=Swarm -DartifactId=Swarm -Dversion=0.9 -Dpackaging=jar -Dfile=/path/to/file

Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=Swarm -DartifactId=Swarm -Dversion=0.9 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

Path to dependency:
1) swarmapp:swarmapp:jar:0.0.1-SNAPSHOT
2) Swarm:Swarm:jar:0.9

4) javax.security:jaas:jar:1.0.01

Try downloading the file manually from:
http://java.sun.com/products/jaas/index-10.html

Then, install it using the command:
mvn install:install-file -DgroupId=javax.security -DartifactId=jaas -Dversion=1.0.01 -Dpackaging=jar -Dfile=/path/to/file

Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=javax.security -DartifactId=jaas -Dversion=1.0.01 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

Path to dependency:
1) swarmapp:swarmapp:jar:0.0.1-SNAPSHOT
2) javax.security:jaas:jar:1.0.01

----------
4 required artifacts are missing.

for artifact:
swarmapp:swarmapp:jar:0.0.1-SNAPSHOT

from the specified remote repositories:
central (http://repo1.maven.org/maven2)



[INFO] ------------------------------------------------------------------------
[INFO] For more information, run Maven with the -e switch
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 5 seconds
[INFO] Finished at: Mon Nov 08 10:01:07 CST 2010
[INFO] Final Memory: 11M/26M
[INFO] ------------------------------------------------------------------------

Thursday, November 4, 2010

i had forgotten to install the swarmapp.. which i got as source code.....
this is done by using maven2

Monday, November 1, 2010