Friday, November 27, 2009

Week 12 Readings

I liked this week's articles, and they were on a topic I'm interested in, which made them even better. I particularly liked that it wasn't just a bunch of articles on tagging or folksonomy in general, but rather articles on different aspects of folksonomy and what it can be used for (i.e., library instruction, academic libraries, etc.). I did think the articles were pretty basic, though, almost bordering on too basic. The chosen articles could have used a bit more depth, but it was good that they all related to libraries.

1.) Allan, "Using a Wiki"

This article was about how libraries can use a wiki to improve library instruction: sharing information, facilitating collaboration in the creation of resources, and efficiently dividing up workloads among different librarians. A wiki is a collaboratively editable web page, a bit like a word-processing document, where you can edit text and attach files.

This article focuses mainly on how wikis can be used to help in library instruction, whether run by the library itself or within a particular class with the help of a professor. Library instruction wikis have two main uses: sharing knowledge and expertise, and cooperating in the creation of resources. Wikis are extremely easy to use and free to create. Once you create one you can invite other users to participate, and they can change the wiki. Wikis are beginning to catch on in many different workplaces.

This was an interesting article, yet somewhat basic. I also think there are many more useful ways to use a wiki than library instruction, though it would certainly be helpful there too.

2.)Arch, "Creating the Academic Library Folksonomy"

This article is about social tagging and its advantages for libraries. Social tagging is a new phenomenon that allows people to create tags for websites and store them online. This could be very useful to libraries: it could help them better support their users' research goals and needs. The article then gives some examples of sites that use tagging, like Delicious.

I had a fairly big problem with this article, especially since it was written in 2007 (fairly recently). The article makes it seem like the only thing tagging is good for is a glorified bookmarking system. It talks about how you can save and tag a website and then retrieve it later on a different computer, much like bookmarking it to a server so it can be found on other machines. Social tagging's capabilities go way beyond this; that is only a small part of the advantage of being able to tag things. Tagging allows things to be stored and organized in ways the physical world never could. With tagging you can organize a book under many different categories instead of it having to sit in one specific spot on a shelf. The article touched on the advantages of tagging but was far too narrow and did not begin to show the scope of what is possible.
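
To make that concrete, here is a tiny sketch I put together (my own illustration, not from the article, with made-up titles) of how one item can live under many tags at once:

    # One book can be filed under several tags; any tag becomes a retrieval point.
    tags = {
        "pittsburgh": ["Annals of Pittsburgh"],
        "local history": ["Annals of Pittsburgh"],
        "steel industry": ["Annals of Pittsburgh", "Homestead"],
    }

    def items_for(tag):
        # Look up everything filed under a given tag (case-insensitive).
        return tags.get(tag.lower(), [])

    print(items_for("Steel Industry"))  # ['Annals of Pittsburgh', 'Homestead']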

3.)Wales "How a ragtag band created Wikipedia"

This was an interesting video, much like the Google one we watched earlier in the semester, as it explained how Wikipedia was formed, its goals, and what it currently is working on. Wikipedia is a free encyclopedia written by thousands of volunteers. The goal of the organization is to provide access to as much information as possible to as many people as possible.

The thing I liked most about the video is that it addressed the main issues, controversies, and myths people have about Wikipedia. The main issue is that many people believe that because anybody can contribute to and change a Wikipedia article, it is not reliable, especially because people can edit articles anonymously. Wales says this is not as big a problem as people think. The software Wikipedia uses is open-ended and everything is left up to volunteers; because of this, people police themselves and other users instead of just creating false articles. Wikipedia maintains a neutral point of view, and again, because many people may be working on the same article, this is not as hard to do as people might think (even on political issues they can maintain a neutral point of view). The one thing I thought was very interesting is what Wales called the "Google Test": if the topic of an article does not show up in a Google search, it is probably not worthwhile enough to have an encyclopedia article about it.

Overall, I liked this video, especially because it gave me some background on a website that I use almost daily.

Week 11 Muddiest Point

When doing link analysis, why don't they look at the outgoing links of a website as well, instead of just incoming links? It seems like outgoing links would also be a helpful way to analyze a website.

Monday, November 23, 2009

Friday, November 20, 2009

Week 11 Readings

So it seems I have been way out of order in doing the blogs and readings. I remember Dr. He saying that the order of content was going to be switched, but when I went to do the readings I completely forgot. I did the readings by their dates in CourseWeb and not in the order I was supposed to. I hope this won't negatively affect my grade, as I have done all of them over the past three weeks, just in the wrong order. I should be back on track for week 12; it was only the last three weeks that were out of order.

This week's readings were interesting and on a topic I enjoy. It was a little different since I am out of order and we have already discussed this topic in class, but I still thought the readings were worthwhile, even if they were late.

1.) Mischo, Digital Libraries: Challenges and Influential Work

Effective search and discovery over open and hidden digital resources is still problematic and challenging. There is a difference between providing digital access to collections and actually providing digital library services. This is a very good point, and I liked it a lot. Simply providing access to a lot of digital collections does not mean you are providing digital library services.

The first significant federal investment in digital library research came in 1994. There has recently been a surge of interest in metasearch or federated search from many different people and institutions. The majority of the rest of the article discussed previous and current digital library research and the institutions doing it.

2.)Paepcke, et al. Dewey Meets Turing

In 1994 the NSF launched the Digital Library Initiative (DLI), which brought libraries and computer scientists together to work on the project. The invention and growth of the World Wide Web changed many of their initial ideas. The web almost instantly blurred the distinction between the consumers and the producers of information.

One point in the article I found very interesting was that the computer scientists didn't like all the restrictions placed upon them by the publishers. They were not allowed to make all of their work public, because that would have made public all of the materials in it (i.e., the publishers' copyrighted material). This is interesting because it shed light on digital copyright restrictions for people who may not have understood them.

3.) Lynch Institutional Repositories

Institutional repositories are a new strategy that allows universities to accelerate changes in scholarship and scholarly communication. The author defines institutional repositories as a set of services a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. This includes preservation of materials when needed, organization, and access or distribution. He thinks a good IR should contain materials from both faculty and students, and both research and training materials.

Universities are not doing a good job of facilitating new forms of scholarly communication. Faculty are better at creating ideas than at being system administrators and disseminators of their work. IRs could solve these problems. They address both short-term access and long-term preservation, and have the advantage of being able to maintain data as well.

The author sees some places where IRs can go astray or become counterproductive: if the IR is used by administration to exercise control over what had been faculty-controlled work; if the infrastructure is overloaded with distracting and irrelevant policies; and if the IR is implemented hastily. Just because other universities are implementing new IRs doesn't mean you should rush into it and start one yourself. These are points the author believes universities need to look at closely when implementing institutional repositories.

IRs promote progress in infrastructure standards in many different ways; the author gives three examples. Preservable formats - the things in the IR should be preserved, though different institutions will do this in different ways. Identifiers - reference materials in IRs will be important in scholarly dialogue and the scholarly record. Rights documentation and management - management of rights for digital materials will be essential, and you need a way to document the rights and permissions of the works.

I liked the article on IRs the best. I'm interested in the many advantages of institutional repositories and how best to implement them, and this article was extremely informative.

Week 10 Muddiest Point

I understand that XML is a general markup language and can be used for many things besides building webpages. Why is XML preferred for building webpages specifically, though? HTML seems much less complicated and hence less time-consuming to use. Maybe I'm just more familiar with writing HTML than XML, but XML seems more complicated and looks like it takes a lot more effort to produce a webpage.

Saturday, November 14, 2009

Week 10 Readings

These readings were on a topic that I'm not overly familiar with, so I learned a lot. I knew of the basics behind search engines, but not that many details on how they actually worked.

1.)Hawking - Web Search Engines

Part 1.
Search engines cannot and should not attempt to index every page on the World Wide Web. To be cost-effective they must reject low-value automated content. Search engines require large-scale replication to handle the load. Search engines currently crawl/index approximately 400 terabytes of data; crawling at 10 GB/s, it would take 10 days to do a full crawl of this much information. Crawlers use "seeds" to begin their search, then look through the links they find for URLs they haven't yet indexed.

Crawlers must address many issues. Speed - it would take too long for each crawler to individually crawl every website, so each one crawls only the URLs for which it is responsible. Politeness - only one crawler goes to a URL at a time, so it doesn't overload the site's servers. Excluded content - they must respect the robots.txt file that specifies whether or not they may crawl the website. Duplicate content - it is simple for a search engine to see whether two pages have the same text, but distinguishing different URLs, dates, etc. requires more sophisticated measures. Continuous crawling - how often a website is crawled/indexed is determined by numerous factors, not just static measures. Spam regulation - websites can artificially inflate their status by adding links pointing to themselves so search engines will rank them higher.
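
To illustrate the politeness/exclusion idea, here is a small sketch of my own (not from the article) showing how a crawler could consult a site's robots.txt before fetching a page; the crawler name and URLs are placeholders:

    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")  # placeholder site
    rp.read()  # fetch and parse the site's robots.txt

    # Fetch the page only if the site's rules allow our (hypothetical) crawler.
    if rp.can_fetch("MyCrawler", "https://example.com/some/page.html"):
        print("Allowed to crawl this URL")
    else:
        print("Excluded by robots.txt - skip it")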

Part 2.
An indexer creates an inverted file in two phases: scanning and inversion. The internet's vocabulary is very large - it contains documents in all languages as well as made-up words such as acronyms, trademarks, proper names, etc. Search engines can use compression to reduce demands on disk space and memory. A link popularity score is assigned to each page based on the frequency of incoming links.
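
Here is a toy version of the inverted file idea that I sketched out myself (not code from the article): each word maps to a posting list of the documents that contain it.

    from collections import defaultdict

    docs = {
        1: "digital libraries and metadata",
        2: "search engines index the web",
        3: "libraries index metadata",
    }

    inverted = defaultdict(list)
    for doc_id, text in docs.items():           # scan each document
        for word in set(text.lower().split()):
            inverted[word].append(doc_id)       # add to the word's posting list

    print(sorted(inverted["index"]))      # [2, 3]
    print(sorted(inverted["libraries"]))  # [1, 3]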

The most common query to a search engine is a small number of words without operators. By default, search engines return only documents containing all of the query words. Result quality can be improved if the query processor scans further down the lists and then sorts the long list of candidates according to a relevance scoring function. MSN Search reportedly takes over 300 ranking factors into account when sorting its lists.

Search engines also have techniques for speeding up their searches and how quickly results are displayed. They often skip hundreds or thousands of documents to get to the required ones, and they stop processing after scanning only a small fraction of the lists. Search engines also cache results, precomputing and storing result pages for thousands of the most popular queries.

2.)Shreeves - OAI Protocol

The Open Archives Initiative Protocol for Metadata Harvesting has been widely adopted since its initial release in 2001; there are now over 300 active data providers. The OAI's mission is "to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content." OAI-PMH divides the world into repositories (data providers), which make metadata available, and harvesters (service providers), which harvest that metadata. No one provider can serve the entire needs of the public, so specialized providers have appeared.

Many OAI registries suffer from a number of shortcomings: typically no search engine, limited browsing, and the fact that they are incomplete. The UIUC research group is trying to address these problems; they are trying to enhance collection-level descriptions to enable better searching. OAI also has some challenges it is currently dealing with: the metadata itself, problems with data provider implementations, and lack of communication between service and data providers.
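
As a concrete illustration (my own, hypothetical example rather than one from the article), a harvester talks to a data provider with a plain HTTP request carrying an OAI-PMH verb and a metadata format; the repository URL below is made up:

    from urllib.request import urlopen
    from urllib.parse import urlencode

    base_url = "http://repository.example.edu/oai"  # hypothetical data provider
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}

    # The response is XML containing Dublin Core records ready to be harvested.
    with urlopen(base_url + "?" + urlencode(params)) as response:
        xml = response.read()
    print(xml[:200])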

3.) Bergman - Deep Web

This article was a tad too technical/scientific for me, but it was still an extremely good read. I didn't necessarily enjoy (or understand) all the technical data and scientific research information, but I really liked the concepts and theories behind it. The main point of this article was something I had never really explored or read about in depth. Overall, this article was one of my favorite reads of the semester, purely for the concepts it talked about.

The article's main concept is that most of the web's information is buried deep where search engines are unable to find it. The deep web is very different from the surface web. The deep web consists largely of searchable databases, which have to be queried one at a time, as opposed to websites that search engines can crawl. BrightPlanet tried to quantify the deep web, and their statistics blew me away. I knew there was a lot of information "hidden" within the web and the Internet, but I did not realize the extent of it. The deep web is 400 to 550 times larger than the surface web (roughly 7,500 TB as opposed to 19 TB). The deep web has about 550 billion documents as opposed to the surface web's 1 billion, and about 95% of the deep web is accessible to the public for free.

The internet is more diversified than most people realize, and the deep web is the major reason for that. The web is really only a portion of the Internet. The deep web covers a broad and relevant range of topics, and the information it contains is of very high quality and is growing much faster than the surface web.

Week 9 Muddiest Point

I actually had no muddiest point this week.

Saturday, November 7, 2009

Assignment #5


Koha Assignment - The books I chose all deal with Pittsburgh.

Wednesday, November 4, 2009

Week 9 Readings

I have absolutely no experience with or knowledge of XML, so these articles were very helpful to me, though sometimes a little too technical for me to understand. I now have a much better understanding of how to use XML, but am still unclear on a few things that will hopefully be talked about in class.

1.)Bryan, "Introducing the Extensible Markup Language (XML)"

XML is a subset of the Standard Generalized Markup Language (SGML) and is designed to make it easy to interchange documents over the Internet. With XML you must always clearly define your start and end tags, as opposed to HTML, where it is sometimes acceptable to omit the end tag. An XML Document Type Definition (DTD) can be used to check that the components of the XML document appear in valid places.

XML is based on the concept of documents composed of a series of entities (things or objects); each entity can contain one or more logical elements, which can have attributes. One of the things I found interesting about XML is the way it incorporates special characters into the markup.

An XML file has three types of markup, the first two of which are optional: first, the XML processing instruction, which identifies the version of XML being used; second, the document type declaration; and lastly, the fully tagged document instance. If all three are present, the document is considered "valid"; if only the last is present, the document is merely "well-formed." XML is ideal for use with databases.
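
A minimal made-up example may help tie the three pieces together:

    <?xml version="1.0"?>                    <!-- 1. XML processing instruction -->
    <!DOCTYPE note SYSTEM "note.dtd">        <!-- 2. document type declaration -->
    <note>                                   <!-- 3. fully tagged document instance -->
      <to>Library staff</to>
      <from>Reference desk</from>
      <body>Instruction session at 2 &amp; 3 o'clock.</body>
    </note>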

2.) Uche, "A Survey of XML Standards"

I liked how this article was set up and that it provided a brief overview of many different parts of XML and what they can do - not just technical information on how to use or write XML.

XML has been widely translated into different languages, but English is still the standard. There was some controversy when XML 1.1 came out: the new version had only very small changes, and people wondered whether a new version was really necessary, especially because there was a good chance interoperability issues would arise.

XML catalogs define the format for instructions on how a processor resolves entities into actual documents. XML namespaces provide a mechanism for naming elements and attributes. XInclude is still being developed, but it will provide a system for merging different XML documents; this is usually used to split large documents into manageable chunks and then merge them back together again. XPath can be used to locate certain elements in a document. XLink is a generic framework for expressing links in XML.

3.) Bergholz, "Extending Your Markup"

I really liked the figures in this article; they provided good examples of XML. The examples were complex enough that you got a decent idea of how to write XML, but not so complicated that you couldn't understand what they were trying to show.

XML is all about meaningful annotations. DTDs define the structure of XML documents; they specify a set of tags, the order of tags, and the attributes associated with each tag. XML elements can be either terminal or nonterminal. Nonterminal elements contain subelements, which can be grouped as sequences or choices. XML attributes are declared with the !ATTLIST declaration.
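
As a small sketch of what such a DTD might look like (my own example, not one taken from the article):

    <!-- A "note" must contain a "to", a "from", and a "body", in that order. -->
    <!ELEMENT note (to, from, body)>
    <!ELEMENT to   (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
    <!-- Attributes are declared with !ATTLIST; "priority" defaults to "normal". -->
    <!ATTLIST note priority CDATA "normal">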

XML extensions include namespaces and addressing and linking abilities. Unlike in HTML, it is not necessary to use an anchor in XML, and extended links can connect multiple documents together. Namespaces avoid name clashes. Extensible Stylesheet Language (XSL) allows you to transform XML into HTML. XML Schema allows the user to define datatypes.

4. W3 Schools Tutorial - XML Schema

Much like the HTML tutorials, the XML Schema tutorial was very helpful; I really like the W3Schools tutorials and their website. I do wish it had given more technical information on how to actually write XML schemas (as it did with HTML) rather than mostly theory about what XML Schema can do.

XML Schema describes the structure of an XML document and can be used as an alternative to DTDs, as it is much more powerful. XML Schema defines elements, attributes, the order of elements, and many other things. It supports datatypes and is itself written in XML, which has many advantages. A simple element contains only text and cannot have attributes; if an element has an attribute, it is considered complex. Restrictions can be used to define acceptable values for elements and attributes.

A complex element contains other elements and/or attributes. There are four kinds of complex elements: empty elements, which can have only attributes and no content; elements-only elements, which contain other elements; text-only elements, which can contain text and attributes; and mixed elements, which can contain attributes, elements, and text. Indicators control how elements are to be used, and string datatypes are used for values that contain character strings.
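
Here is a brief sketch of my own (following the tutorial's general patterns rather than copying an example from it) showing a simple element, a restriction, and a complex element:

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <!-- Simple element: text only, no attributes. -->
      <xs:element name="title" type="xs:string"/>

      <!-- Restriction: "copies" must be an integer between 1 and 99. -->
      <xs:element name="copies">
        <xs:simpleType>
          <xs:restriction base="xs:integer">
            <xs:minInclusive value="1"/>
            <xs:maxInclusive value="99"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:element>

      <!-- Complex element: contains other elements in a fixed sequence. -->
      <xs:element name="book">
        <xs:complexType>
          <xs:sequence>
            <xs:element ref="title"/>
            <xs:element ref="copies"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>

    </xs:schema>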

After reading the articles I have a much better understanding of XML. One thing I'm still confused about, though, is the advantage of XML over other markup languages, specifically HTML. A lot of the articles stated that XML was better and gave theoretical reasons why. From looking at the many examples, though, XML looks enormously more complicated than HTML; something that can be written in a few lines of HTML looks like it takes many more lines of XML. I know XML is supposed to be better than HTML; it just looks more complicated and time-consuming.

Week 8 Muddiest Point

I understand how to make a cascading style sheet and what its purpose is. But how do you apply the style sheet to the document you are writing in HTML? I know that with an external style sheet you provide a link to it when you write your HTML document, but if you have what you've written in HTML and your cascading style sheet, how exactly do you "combine" the two?

Thursday, October 22, 2009

Week 8 Readings

1.) W3Schools HTML Tutorial

I'm pretty experienced in writing HTML, so there was not much new in this article for me. It was a good refresher for me though, since I haven't written in HTML in a few years and had forgotten some of the basics to it. For somebody inexperienced in HTML though this would be an amazingly informative tutorial. The tutorial provides a beginner with everything they need to make a simple web page - and with a little practice even an advanced one.

HTML stands for HyperText Markup Language; it is not a programming language. HTML tag keywords are surrounded by angle brackets. Most browsers will display HTML correctly even without an end tag, but future browsers will not be so forgiving. HTML attributes provide additional information about an element. Text formatting lets HTML change the appearance of text, e.g., bold, italics, etc. One thing I didn't know about HTML was that (or how) you could make forms; it was interesting and surprisingly easy. The newest version of HTML is HTML 4.0, which separates presentation from document structure. You now use style sheets to change the presentation of the web page, which is much easier and faster.
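
For example, a bare-bones page with a small form (my own illustration, not taken from the tutorial; the form's action URL is a placeholder) only needs a handful of tags:

    <html>
      <head>
        <title>Ask a Librarian</title>
      </head>
      <body>
        <h1>Ask a Librarian</h1>
        <p>Send us your <b>reference</b> question:</p>
        <!-- A simple form: each input tag's attributes give extra information. -->
        <form action="/submit" method="post">
          <input type="text" name="question">
          <input type="submit" value="Send">
        </form>
      </body>
    </html>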

2.) HTML Cheatsheet

This was a good reference for somebody writing HTML. It would be especially good for somebody who has a decent handle on HTML but is still beginner enough not to remember all the tags. It would also be good for somebody in my position, who used HTML a lot in the past but hasn't used it in a while, as a refresher on the tags.

3.) W3Schools Cascading Style Sheet

This tutorial was very interesting and informative for me, as I have never used a style sheet before. Even though I have used HTML a lot, I always included formatting and style tags in the HTML itself and never used a style sheet. I really liked the tutorial, and it is definitely something I would refer to if I ever use HTML again.

CSS stands for Cascading Style Sheets, which define how HTML elements are displayed. CSS's purpose is to control style and layout, and it can be used to control the style of many web pages all at once. HTML was never intended to contain tags for formatting. CSS syntax is made up of three parts: the selector, the HTML element or tag you want to define; the property, the attribute you wish to change; and the property value. When a browser reads a style sheet, it automatically formats the document according to it. There are three ways of inserting a style sheet: external, when it applies to many pages; internal, when it applies to a single document; and inline, where you lose the advantages of a style sheet by mixing content and presentation. There are two font-family types: generic families, which all have a similar look, and font families, which are specific fonts.
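
A short sketch of my own (file names are placeholders) showing the selector { property: value; } syntax and how an external sheet gets attached to an HTML page:

    /* styles.css - each rule is selector { property: value; } */
    h1   { color: navy; font-family: Arial, sans-serif; }
    p    { font-size: 12px; }
    body { background-color: white; }

    <!-- In the HTML document's head, the external sheet is linked like this: -->
    <link rel="stylesheet" type="text/css" href="styles.css">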

4.)Goans - Beyond HTML Article

This article is about the new Georgia State University library website. Their previous website was inconsistent, had too many people doing too many different things (many with no web experience), and had no security. GSU decided to go with a database-driven system, which is more flexible and efficient.

Content management (CM) is the process of collecting, managing, and publishing content. In a content management system (CMS), content is disconnected from layout and design. You do not need to know HTML to use their CMS, which was very helpful to GSU, as most of their contributors were not very familiar with HTML. Lots of different people have access as content creators, yet the CMS still acts as a limited gatekeeper. The system gives creators the freedom to tag and organize content as they want, and it allowed the GSU library to do many things their previous file/folder system couldn't.

GSU looked at other options before choosing to build a CMS. They looked at programs like Dreamweaver and other HTML editors, but couldn't justify them because their current software was free through the university and the library would still have had the same problems. They also couldn't justify spending money on a commercial CMS when other free software existed. They looked at open source CMSs, but what the library needed was too big; they would have had to piece together a bunch of different open source packages to make it work. They ultimately ended up building their own system.

The heart of the CMS that GSU built is a MySQL database on a Windows server. It is made up of resource tables, which store content; metadata tables, which assign content to templates; and personal metadata tables, which house login data and contact info. The CMS is basically a digital repository for a digital library system.

Week 7 Muddiest Point

Can anybody have access to Internet2, or is it closed to researchers and universities? Is the main goal of Internet2 to test standards and new technologies before they are introduced into the mainstream internet?

Monday, October 12, 2009

Week 7 Comments

http://mdelielis2600response.blogspot.com/2009/10/week-7-reading-responses.html?showComment=1255404581125#c9144006567009363903

http://lis2600infotechnology.blogspot.com/2009/10/jing-assignment.html?showComment=1255402991268#c4371057323831862153

Assignment #4 - Jing


I did my Jing assignment on how to send and open an E-mail attachment.

Screencast Video

Flickr photos - as a set with all 5 photos

And individually, photos 1, 2, 3, 4, and 5


Wednesday, October 7, 2009

Week 7 Readings

1.) Tyson - Internet Infrastructure

This was a good article, but it seems it may have been more useful to read it last week. It didn't really repeat anything from last week's articles, but it did repeat what was said in class; the information in this article was already covered by Dr. He in class last Tuesday.

The article talks about networks and the internet's infrastructure. Nobody technically owns the internet - it is just a global connection of networks. Many different ISPs are interconnected at NAPs (network access points), where they agree to exchange traffic with each other. Networks rely on NAPs, backbones, and routers to communicate with each other. Routers determine where to send information from computer to computer, and backbones are the fiber-optic trunk lines that carry it.

An IP (Internet Protocol) address is a unique identifying number for every machine on the internet. The four numbers in an IP address are called octets because each represents eight bits, for a total of 32 bits. DNS servers translate human-readable domain names into IP addresses. Caching essentially "saves" answers from the DNS root servers so they don't have to be contacted every time.
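
A quick way to see name resolution in action (my own example; the host name is just a common placeholder):

    import socket

    # Ask DNS to translate a human-readable host name into a numeric IP address.
    ip = socket.gethostbyname("www.example.com")
    print(ip)  # four octets, e.g. something like 93.184.216.34 - 32 bits in total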

2.) Pace - Dismantling ILS

I'm not exactly sure what the argument in this article was. It was talking about ILS and problems with interoperability in regards to ILS, but I'm not sure what the author was trying to convey exactly.

The ILS essentially changed in the early '90s when libraries began embracing the web. Interoperability in library systems is a problem: many people believe the technology in libraries is interoperable, but it really isn't. Current ILSs are not doing the job libraries need them to, and libraries often have to find their own solutions or buy standalone products. Some libraries are exploring open source software to solve some of their problems - but that creates interoperability problems of its own.

I think the article was implying that vendors need to make better products with better interoperability, not only among their own products but with other vendors' products as well, but I'm not really sure if that was the gist of it or not.

3.)Inside the Google Machine Video.

This was an interesting video, and it was cool to hear about Google as a company from the founders. I'm not really sure what its relevance to our class was, though. It was interesting and informative, but didn't really have that much to do with information technology as far as I could see.

The most interesting part of the video, in my opinion, was the 20% rule. Google allows their employees to spend 20% of their time working on things they feel are useful or important. This is a really cool idea, and they say it is how they keep new and innovative ideas coming. It is something other companies may want to implement in one way or another. I know not all companies can afford to do something like this, or may be in an industry where it wouldn't be advantageous, but for a company like Google it is a great way to keep the creative juices flowing in their employees.

Week 6 Muddiest Point

What are the advantages of a ring network? It seems that bus and star networks are much more robust and useful, so why would anybody set up a ring network?

Saturday, October 3, 2009

Week 6 Comments

http://brandonlocke.blogspot.com/2009/10/readings-week-five.html?showComment=1254625692755#c2429541382742443838

http://djd2600it.blogspot.com/2009/10/weekend-update-comments-take-6.html?showComment=1254626172628#c3986758631378457901

Week 6 Readings

1.) Wikipedia - LAN's

A good, but brief, article on LANs. LANs (Local Area Networks) are computer networks covering small physical areas. They usually use either WiFi or Ethernet to connect. They were developed out of necessity by larger universities and research labs that needed high-speed interconnectivity and wanted to share expensive disk space and laser printers.

I've had a good deal of experience working with LANs. I've set up both wireless and wired LANs, using Cat5 and Cat5e cables as well as wireless b/g. They are fairly simple to set up if you have a little bit of experience, and are very useful for personal home use or for playing computer games with your friends.

2.) Wikipedia - Computer Networks

An article with a general overview of computer networks. Computer networks are basically groups of interconnected computers. They connect to each other either with wires (Ethernet, coax, etc.) or wirelessly (WiFi, Bluetooth, etc.). There are many different types of computer networks, depending on what they are going to be used for - personal, local, campus, metropolitan. Wide Area Networks cover a broad geographic area, and the most well-known one is what we call the Internet. You need some specific hardware in order to use computer networks: your computer needs a network card, and networks also rely on hubs, switches, and routers.

I've had a little bit of experience working with computer networks also; most of it was working specifically with LANs, as explained above. I've also worked with Personal Area Networks and a little bit with Campus Area Networks.

3.) Coyle - Management of RFID in Libraries

I had actually never heard of RFID before reading this article, so it was very informative for me. After reading it I realized I have used and been around RFID for some time, I just didn't know it. RFID (Radio Frequency Identification) is similar to a barcode in what it does, but not in how it does it. RFID uses electromagnetic fields as opposed to laser beams, and a tag can carry a lot more information than a barcode. RFID is very good for inventory functions. RFID in libraries could be even more useful than in retail because the tags would not be "thrown out": in a library the RFID tag is reused because books leave and then come back, as opposed to retail, where something is sold and never returns.

RFID could also be useful for security measures in the library. It has a number of problems with regard to security, but it is no worse than what is already in place, and it could cut down on time and money because the library would not have to install two different systems.

One problem libraries might experience with RFID is whether tags can be installed in magazines, pamphlets, sheet music, and other less sturdy items.

RFID is a fairly new technology and seems like it could work well in libraries. There are a lot of advantages libraries could take advantage of to save time and money, but there are also some problems libraries need to look into before deciding to switch over to it.

Week 5 Muddiest Point

Why are .AVI files still popular? I've never come across a player made expressly for the .AVI extension. Most popular formats have their own players (Windows Media, QuickTime, RealPlayer, etc.), and even though you can download plugins to play other files in them, each is made primarily for its own format. I know you can play .AVI files in most or all of these players by downloading the right plugins and codecs, but with the ease of playing other formats in their own players, why are .AVI files still so popular? Do they provide something other formats don't?

Saturday, September 26, 2009

Week 5 Readings

1.) Wikipedia Article - Data Compression

I really like the fact that we start off each week's readings with a Wikipedia article - they provide a good overview of what we will read about in more depth later. Again, this article was a good overview, though it did not go as in depth as some of the others.

Data compression is also known as source coding. It is the process of encoding information using fewer bits. Both the sender and the receiver must understand the coding scheme - you must know how to decode it. Data compression is useful because it reduces the consumption of expensive resources (such as hard disk space or bandwidth).

There are two different kinds of data compression: lossless and lossy. Lossless compression uses statistical redundancy to represent data more concisely and without error, and it is used mostly for text-based compression. Lossy data compression is possible if some loss of fidelity is acceptable; it "rounds off" some of the less important information. This is used more with visual or audio data, where some loss of quality is okay and you still get the same basic idea.
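
A toy example of the lossless idea (mine, not from the article) is run-length encoding, which replaces runs of repeated characters with a count plus the character; the receiver can reverse it exactly, so nothing is lost:

    from itertools import groupby

    def rle_encode(text):
        # "aaab" becomes [(3, 'a'), (1, 'b')]
        return [(len(list(group)), char) for char, group in groupby(text)]

    def rle_decode(pairs):
        return "".join(char * count for count, char in pairs)

    encoded = rle_encode("aaaaabbbccccccc")
    print(encoded)                                   # [(5, 'a'), (3, 'b'), (7, 'c')]
    print(rle_decode(encoded) == "aaaaabbbccccccc")  # True - nothing was lost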

2.)Data Compression Basics

As Blackboard said, these were long documents, but they did a good job of covering the basics of many different kinds of data compression. The articles were technical enough, but not so technical that you couldn't understand them; they balanced the technical jargon while still explaining the concepts in plain language.

I really liked the article's definition of data compression: it "lets you store more stuff in the same space, and it lets you transfer the stuff in less time, or with less bandwidth." This is a definition anybody can understand, and it does a good job of summing up what data compression is.

This article had too much information to sum it all up in this blog, so I'm just going to point out the one thing I found most interesting and didn't know before reading it. I didn't realize that moving pictures are actually just a sequence of individual images, so to compress them you can compress each image individually (most likely as a JPEG). To replay them, the playback device must be able to decompress the images quickly enough to display them at the required speed.

3.)Galloway Imaging Pittsburgh

This article was very informative and provided a lot of useful real-world information for future archivists and librarians. It was about a 2002 grant Pitt received to provide online access to photographic collections. Over 20 collections from three different institutions contributed more than 7,000 images that were placed online. On the website where the images live you can do a keyword search, read about the collections and their contents, explore the images by time, place, or theme, and order image reproductions.

What I liked most about the article, and what was most informative, were the real-world problems they encountered and how they dealt with them. Although a lot of the problems were due to the fact that the project was a collaboration of three institutions, the problems and solutions were very informative. Selection will always be a big issue with grants and projects of this nature, and they ran into it, especially with split collections. They also ran into metadata challenges because each institution wanted a specific kind of metadata for its images. Copyright issues were a problem too, and I liked their solution: they provide a generic copyright warning and also an individual one specific to each image and institution, to cover all their bases.

I took a look at the website that the images are displayed on and it is very well done and has an enormous amount of very informative and cool pictures. I especially liked the sections on maps. Overall, this was a very good article and very informative for future archivists.

Week 4 Muddiest Point

I'm not sure what schema is in general. I understood what was meant by metadata schema but not what schema in general is.

Saturday, September 19, 2009

Week 4 Readings (Addition)

I totally forgot that I wanted to add some personal comments at the end of the section on Gilliland. I particularly liked this article - it was probably my favorite article we have read so far. There must be thousands of articles available on metadata, and I'm really glad he chose this one, which relates metadata to libraries and archives so well. It gave a really good perspective on metadata in general and how it relates to the field we are all interested in.

Great article and great choice in making us read it!!!

Friday, September 18, 2009

Week 4 Readings

1.) Wikipedia - Database Article

This article provided an overview of databases and their features.

Databases can be either row- or column-oriented. A database management system (DBMS) is the software that organizes the storage of the data; it controls the creation, maintenance, and use of the data. Most DBMSs are relational and have five components: the interface drivers, SQL engine, transaction engine, relational engine, and storage engine.

There are many different types of databases, many of them highly specialized. It seems that an operational database is the one an average person would use on a day-to-day basis. Databases make use of indexing, which can increase their speed: an index allows the set of tables and rows matching certain criteria to be located quickly. Indexing takes up a lot of storage room, though, and indexes must be kept up to date as the data changes, which takes time.
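
As a tiny illustration of my own (using Python's built-in sqlite3 rather than any particular DBMS from the article, with made-up data), an index is created on a column so lookups on that column don't have to scan every row:

    import sqlite3

    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE books (id INTEGER, title TEXT, author TEXT)")
    conn.execute("INSERT INTO books VALUES (1, 'Annals of Pittsburgh', 'Craig')")

    # The index trades extra storage and upkeep for faster lookups on 'author'.
    conn.execute("CREATE INDEX idx_books_author ON books(author)")

    rows = conn.execute(
        "SELECT title FROM books WHERE author = 'Craig'").fetchall()
    print(rows)  # [('Annals of Pittsburgh',)]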

Like all things dealing with computers, security is a big issue for databases. They have three main mechanisms to enforce security: 1. access control - who can and cannot access the database, and what they can do to it; 2. auditing logs - what has been done, when, and by whom; 3. encryption - data is encoded and then deciphered when needed.

Locking is how databases handle multiple, concurrent operations. Only one process at a time can modify the same data. Databases can handle multiple locks at the same time.

Overall, this article was a good general overview of databases.

2.) Gilliland - Introduction to Metadata

Metadata is "data about data" - it is widely used but understood in many different ways by different people. All information objects have three features which can be reflected through metadata. 1. Content - relates to what the object contains or is about - intrinsic to an information object. 2. Context - indicates who, what, where, why, how of an objects creation. 3. Structure - formal set of associations within objects.

Libraries and museums use metadata - the information they create to arrange, describe, track, and enhance access to information objects. Their goal, first and foremost, is to provide intellectual and physical access to materials. A large part of archives' and museums' use of metadata centers on context - preserving context.

The structure of metadata is important - it can provide visual cues to researchers, and the more structure you have, the more searching and manipulating you can do. Metadata is also used in digitization, primarily as a descriptor of context.

The primary functions of metadata are: creation and reuse - objects are either created digitally or converted into digital format; administrative/descriptive - metadata should be added as descriptors, depending on the intended use of the object; organization and description - describing and organizing the objects in a collection; validation - proving authenticity and trustworthiness; and disposition - metadata is a key component in documenting the disposition of objects.

3.) Miller - Dublin Core.

This article was about DCMI. It was pretty basic and didn't really say all that much, yet at the same time it was very confusing. I'm fairly good with computers and know a lot about them, yet I didn't really have any idea what this article was saying - or, more importantly, what it was trying to convey to the reader. It basically just listed the main aspects of DCMI in very technical terms and nothing else.

I think the main point was that they realized there would never be one "true" set of semantics, so DCMI had to make it possible to mix semantics. You must refine general semantics to say something more specific, and the ability to specify a particular encoding scheme is critical: things may be written and said in different ways, but they need to be written in the same code.
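
To make that a little less abstract, here is a hypothetical record of my own (not an example from the article) that uses a few of the fifteen Dublin Core elements; the identifier URL is a placeholder:

    <record xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Imaging Pittsburgh photograph collection</dc:title>
      <dc:creator>University of Pittsburgh</dc:creator>
      <dc:subject>Pittsburgh (Pa.) - History</dc:subject>
      <dc:date>2002</dc:date>
      <dc:type>Image</dc:type>
      <dc:identifier>http://images.example.edu/record/001</dc:identifier>
    </record>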

Standards are necessary for DCMI to work on a global level, and those standards must be as precise as possible.

Overall, I really didn't understand the point of this article (not why we had to read it - I understand why we had to read about DCMI - but the actual main point of the article). Even though I have a fairly good understanding of computers and understood most of the technical jargon, it was still confusing; I'm sure most people were probably even more confused than I was. Oh well - it will be a good thing to learn about in class this week at least.

Week 3 Muddiest Point

I really didn't have a muddiest point again this week; I pretty much understood everything we talked about in class. I will give my input again on what I thought other people might be confused by, though. During one of the breaks I was talking to some people who were confused about command-line OSs. They seemed to think that everything you did had to be done through the command line (e.g., one person thought that to type in a word processor you had to enter a command for everything you typed). It might be easier to explain that modern command-line OSs are just like the DOS we all used to use: you only need to type a command to, say, open a program, and then the program is used just as it would be on a GUI OS. Also, a lot of modern command-line OSs are similar to GUI OSs in that you can now point and click to perform basic tasks.

Tuesday, September 8, 2009

Week 3 Readings

1.) Thurott, "An update on the Windows Roadmap"

This article provided an overview of the future of three different Windows operating Systems.

Windows XP - as an XP and Linux user (I have my hard drive partitioned to run both), this section was the most useful to me. Here I learned that SP3 was just released for XP and that Microsoft has extended support for XP until 2014, both of which were important for me. It also provided information on downgrading to XP if you are buying a new computer that comes with Vista standard.

Windows Vista - this section covered important updates to Vista. Vista has improved its security and fixed most of the compatibility problems it experienced when first released. I had never heard of the "telemetry system," so that was an interesting read: Vista's telemetry system gathers anonymous information about how customers are using Vista, and information gathered from it was put into SP1.

Windows 7 - the article also talks a little bit about Windows' new OS, Windows 7, due to be released in early 2010. They are taking feedback about Vista into consideration for the new OS.

2.) Mac OS X Articles

I have used Macintoshes numerous times before, but have never really studied them or their OS, so these articles were the most informative and interesting to me. It was nice to study something I have used before but know almost nothing about. One of the big things I did not realize about Mac OS X is that it has been used exclusively on Macs since 2002 and that Apple has released successive versions of it since then. I thought Mac OS X was a single release, like Vista being Microsoft's newest OS, and did not realize that Mac OS X 10.6 was simply the newest version (similar to how Linux releases work). Although it is clear to see now, I also did not realize that Mac OS X is a Unix-based OS. OS X is currently used in servers, the iPhone, and the iPod Touch, as well as on personal computers.

OS X was a complete overhaul of and huge step up from OS 9. Some of the better features were its improved ability to run multiple applications and visual improvements like the Aqua theme. Xcode, which supports C, C++, and Java, was also new to me. Some newer versions of OS X have had hardware issues with older computers, though (as have Windows OSs).

Mac's transition to Intel processors brings up an interesting point and a problem I see with Apple computers. Macs transitioned from PowerPC to Intel processors, and there have been some compatibility issues between the two.

One of the main problems I have always had with Apple computers is that they are basically the exact opposite of open source. Because Mac operating systems can only be run on Apple computers, they do not allow the best minds, and competition, to make the best computers and software. Apple has only a limited number of employees working on improving their hardware and software. Windows allows for competition because anybody can build a computer using the best parts available and install a Windows OS on it. Mac has limited itself to the hardware and software its own corporation can come up with. The transition from an Apple-related processor (PowerPC) to a third-party processor (Intel) shows that Apple's resources are too limited to stay on top of the game for every component of a computer.

Overall, I really enjoyed reading these articles because I did not know a lot about Macintosh operating systems.

3.) Garrels, "Introduction to Linux. A Hands on Guide"

I also enjoyed this article on Linux. I am familiar with Linux (I run both Linux and Windows on my computer, though I use Windows approximately 75% of the time), so it was nice to read about some of the things I didn't know. I didn't know that Unix was developed by Bell Labs or that C was developed specifically for Unix. I also didn't realize that Linux was named after a guy named Linus, who started to develop it after PCs became powerful enough to run Unix.


Linux is now available on desktops, laptops, servers, PDAs, mobile phones, and a host of other things. Linux is much more user friendly than it was in its early days, and it has incorporated GUIs to ease the transition from Windows. Linux is open source software: open source means that anybody can improve or change the software, as long as they keep the original available. Open source generally gets things done better and faster, because you have a ton of people with different computers and backgrounds working on making something or solving a problem. Linux is completely free and extremely secure. One of the problems with Linux and open source is that there are too many distributions; you must make sure that the distribution you wish to install will run on your computer's hardware.

Muddiest Point Week 2

As I consider myself fairly knowledgeable about computer hardware, I was not really confused by any of this week's lecture and don't really have a muddiest point.

Because of this I will give my input on what I think the other students may have been confused by, or what could have been explained a little more thoroughly. Although binary is an extremely complex thing to understand, and the majority of librarians and archivists are never going to need to know binary, I think you could go into a little more detail on how binary works. It is always important to have at least a basic understanding of the tools you are working with; binary is an important part of the computer and could have been explained a little more thoroughly.
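
For instance, a quick worked example (this one is mine) would probably have been enough: each binary digit stands for a power of two, so 1011 means 8 + 0 + 2 + 1 = 11.

    # Quick check of the worked example above.
    print(int("1011", 2))  # 11
    print(bin(11))         # 0b1011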

Friday, September 4, 2009

Muddiest Point Week 1

Information Technology - from both the class lecture and readings it seems to me that IT is acquiring, storing, managing, etc. information. IT's goal is to help people create and use information. From the words information and technology this makes perfect sense to me and seems to be the correct definition of IT.

Why is it, then, that when most people think of IT in the job market, they generally think of the "computer people" - the people who come to your office and fix your computer, get rid of your viruses, and install new software? IT in the public's view (at least in the corporate world) seems to refer to computer technicians or computer scientists - the people who fix your computer - and has nothing to do with information. Is this a legitimate description of an IT professional, and if not, how did this public perception come about?

Week 1 Readings

1.) Vaughan, J. Lied Library

An overview of UNLV's new library building (Lied Library) and the technological advances and upgrades that came with it. This article would be a good basic reference for anybody in the planning stages of building a new library (academic, public, etc.). It is fairly basic, and anybody planning a new library would need to do more in-depth research, but it provides a wide range of categories for planners to look into and research further.
Some of the main points that I found either interesting or odd were:
  • The ability to replace over 600 computers while the library was still open. Key points he made about this were that the staff acquired the software for the new computers early so they could familiarize themselves with it, and that the storage and delivery company really helped by spacing shipments out.
  • Space considerations - as the library grows, so does the staff - don't forget about the staff's increasing space needs while planning a library
  • Don't underestimate security - both physical and computer security
The one main problem I had with the article is that Vaughan talks a lot about finances and where to get the money for all of the upgrades, security, and other needs, but he offers no solution on how to obtain the money and does not tell you where Lied Library received its funding from.

2.)Lynch, Clifford Information Literacy and Information Technology Literacy

Lynch believes there are two general perspectives on information technology literacy:
1. Skills in the use of tools (Word, files, etc.) - these are more superficial, and information technology literacy needs to be more than this
2. How technology and infrastructure work - the principles of the technological world, taken in the broad view, not just computers - this is taught only very limitedly in schools, and he argues that it is important to everyone, not just those in a related field

This article was a good overview of both information literacy and information technology literacy. I really liked how he talked about how information technology literacy affects information literacy.

3.) OCLC report: Information Format Trends

I really enjoyed this article and thought it was well written. It provides a good overview of its point that modern-day consumers no longer care how they get their information (the container), just that they get it (the content). The article calls consumers "format agnostic," meaning content is no longer format dependent. Libraries and other content providers must accommodate and adapt to this new consumer demand.

The section on Marshall McLuhan was especially interesting to me. In one of my upper-level undergraduate classes I wrote a research paper on the "Rhetoric of the Millennials" (the teen/lower-20s generation that grew up in the late '90s and 2000s - the generation after Generation X), in which I extensively studied McLuhan and "Understanding Media." I thought the authors did a good job of integrating McLuhan's "the medium is the message" into this article and of arguing that text is the internet's medium.

I had never heard of "payload" e-mail before and liked the way they used this concept to interpolate data.

The article did a good job of bringing to light the idea that there is a major social change underway. It also brings up the problems of social publishing, such as the fact that there are no licenses for blogs or wikis.

The article says that libraries used to be unparalleled collectors of content, and that this is no longer true. As a society we no longer lack content, but as the digital world continues to grow, we are now lacking context.


Test Post

Test Post for LIS 2600 Blog