Question:
Is there an easier, faster way to analyze the source code of an open source project?
cshong1987
2011-02-27 22:49:26 UTC
As a computer science degree student, my time is very important, and the time allocated for each assignment is very limited. If you have ever been a student, you will understand.

For one of my assignments, I need to choose an open source distributed file system and enhance it. I have chosen GlusterFS.

There are many source files in GlusterFS, and most of them are very long. There are no comments or explanations in the source files.

Since GlusterFS is a server application, I want to find out which part of the source code accepts connections from clients. However, because the source code is so complex, I would need to spend a huge amount of time to find that part.

I suspect that by the time I find the part of the source code that accepts client connections, my semester will have ended.

So, is there an easier and faster way to analyze the source code of an open source project and find the part that implements a specific feature?

Or, can you give me some suggestions or comments?

And, I hope that the solution will not require me to contact the developers.
Three answers:
Light Cloud
2011-02-27 23:08:06 UTC
Sadly, it's a fact of life that even when source code is available, if the code isn't well organized (insufficient or outdated comments, lack of structure, etc.), it might be difficult to trace the flow of execution.



I'm not familiar with GlusterFS, although I've attempted to read other people's source code before. It's not always easy, but I get the feeling it's one of those things that we as CS people have to deal with (even though I wish other people's code were better commented).



Although I definitely can sympathize with your pain and frustration, I'd recommend trying:

-Use a debugger, if possible. Set breakpoints liberally, and you can figure out what functions are called for specific features when the breakpoint is hit while you're using that feature.

-Without an adequate debugger, you could substitute print/debug statements to trace which functions are called.

-Consult the API and/or documentation, if they exist (hopefully...).

-If you're familiar enough with the language to know what kinds of functions must be used, you could do a global search across all files for those keywords. For instance, accepting a connection from a client usually involves sockets, so searching for socket-related keywords (like "socket", "bind", "listen", "accept", "connect", "port", etc.) might lead you to the correct files.

-If the project has forums and/or mailing lists, you could search there for hints.
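
The global-search suggestion above could be sketched roughly like this in Python. This is only a minimal sketch, not anything taken from GlusterFS: the function names `find_calls` and `summarize` are made up for illustration, and the list of call names is an assumption based on the usual BSD socket API that a C server uses to accept TCP connections. It matches text only, so hits inside comments or strings will show up too, and wrapper functions with prefixed names (something like `sys_accept`) will be missed unless you add them to the list.

```python
import os
import re

# Call names a C server typically uses to accept TCP connections.
# This list is an assumption; adjust it for the project you are reading.
SERVER_CALLS = ("socket", "bind", "listen", "accept")

# Match a listed name followed by "(" (optionally with whitespace before it),
# but not when it is the tail of a longer identifier such as "my_accept_cb".
CALL_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])(" + "|".join(SERVER_CALLS) + r")\s*\("
)


def find_calls(root, extensions=(".c", ".h")):
    """Yield (path, line_number, call_name, line_text) for each match under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue  # skip anything that is not a C source or header file
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    for match in CALL_PATTERN.finditer(line):
                        yield path, number, match.group(1), line.strip()


def summarize(root):
    """Return a dict mapping each file path to its number of matching calls."""
    counts = {}
    for path, _number, _call, _text in find_calls(root):
        counts[path] = counts.get(path, 0) + 1
    return counts
```

Calling `summarize()` on the top-level source directory gives a per-file hit count, and the files with the most hits are usually a reasonable place to start reading or to set the first breakpoints. `find_calls()` gives the individual lines if you want to look at each match.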



Unfortunately, it's easier to write code than it is to read someone else's code, and poorly written/documented code is incredibly hard to trace. Just going to have to deal with it though :(
anonymous
2011-02-27 22:53:25 UTC
What you ask is impossible. If the code is complicated, has no comments and is using stuff you aren't all that familiar with, it would take a lot of work to understand it no matter what you do. Even contacting the developers wouldn't do you any good, because it's not like they are going to explain everything to you on request, especially since they didn't even bother to comment the code.



Maybe you should consider a different open source project?
?
2011-02-27 23:53:03 UTC
For network connections, look for "socket". E.g. in the client, there's code in glusterfs/contrib/fuse-lib/mount.c



There's no reason not to contact the developers - at least, to subscribe to the mailing list and post questions.



It should not be hard to find the connection code. It will be harder to make a significant, meaningful contribution to the project. You could go through the bug lists and try to fix some issues, e.g. there's an IPv6 enhancement listed in the bug tracker.


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.