The deep or “invisible” Web consists of those resources that, while accessible over the Web, are not subjected to search engine indexing. Imagine the sites you have visited that have a search box on them. If the search retrieves Web pages, those are probably in Google or Bing. But if it’s returning directory listings or other information that comes from a database, it may not be.
I came across Complete Planet in a list of “alternative” search engines. You can search Complete Planet’s directory of databases for relevant matches or you can browse by topic. The “Law” category has over 1,100 entries. It contains common sites, like Findlaw or Lawyer.com, but not necessarily a lot of high-quality ones. The site may only be a demo for its deep Web search owner, BrightPlanet, as most entries in the legal category haven’t been indexed since 2004, and a number of the ones I clicked were no longer there. However, it may be worth noting since it covers a wide variety of topic areas and might help you to identify a source, even if the directory entry is no longer there.
This change in service from DeepDyve caught my eye. They have provided a method of reading “deep Web” content for a while. Normally, you would do a search and pay to read the article it unearths. Now, you can search DeepDyve and you have five minutes to read.
DeepDyve is light on legal information but its depth in academic works offers a wealth of social science research that may be applicable to cases. Type in “law journal” or “law review” to see the short list of available titles. When you do a keyword search, you can immediately see which articles are rentable.
There has been some discussion about how researchers now skim research rather than read from end to end. This would seem to be a good match for that change, enabling people looking for information chunks to quickly see if they have found what they are looking for.
Google Reader’s imminent departure is a great opportunity. It is like cutting down a large overgrowth of kudzu that may enable other interesting options to grow and flourish. One that interests me is the open source Tiny Tiny RSS server. If you have more than one person in your organization who follows RSS feeds or who might want to, this could be an excellent way to centrally offer this service.
Tiny Tiny RSS runs on the same LAMP / WAMP technology that runs WordPress, and it requires the same technology skills. This means it’s a bit more advanced than a desktop application you download and install, but it by no means requires heavy-duty programming chops. I was able to download and get Tiny Tiny running in about 30 minutes on Ubuntu.
There are other guides, although a bit dated, for other systems. To install on Ubuntu, assuming you already have Apache 2, MySQL, and PHP 5 installed:
1. Download the basic files and extract them into the folder from which they’ll run. I placed mine in a subfolder of my WordPress installation, so that I could re-use my current domain name and just treat it as part of my overall site;
2. Create a MySQL database and a database user for Tiny Tiny RSS to use;
3. Then, following that set of instructions, insert the necessary SQL information into the new database:
    mysql -u ttrssuser -D ttrssdb -p < schema/ttrss_schema_mysql.sql
Obviously change the username and database to the ones you created. Look for the schema folder within the folder where you extracted Tiny Tiny.
4. Read the README.md file, copy the config.php-dist file to config.php, and complete the necessary information about your database username, password, database name, and server. You can also turn on the “simple” updating method. There is an automatic update function using a daemon, but the simple method will work for a small site. Update: here’s another installation checklist for Ubuntu, and it also has the simplest explanation I’ve seen for activating the daemon.
5. I didn’t see this mentioned in any of the tutorials but you also need to secure the files themselves. In the folder where you extracted the files from step 1, make sure you set the ownership and rights. I again copied WordPress, so my Tiny Tiny installation uses the www-data user. The files and directories should be as secure as you can make them: chmod files 644 and directories 755.
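The permissions part of step 5 can be sketched in Python. This is a minimal sketch, not part of the Tiny Tiny RSS documentation: the function name secure_tree and the example path are made up for illustration, and it assumes a POSIX system such as Ubuntu. It only applies the 644 / 755 modes described above; changing ownership to www-data is a separate step that normally needs root.

```python
import os
import stat
from pathlib import Path

FILE_MODE = 0o644  # owner read/write, everyone else read-only
DIR_MODE = 0o755   # owner full access, everyone else read and traverse


def secure_tree(root):
    """Set root and every directory below it to 755 and every regular file to 644.

    Symbolic links are skipped so that nothing outside the tree is changed.
    Returns a (directories_changed, files_changed) tuple.
    """
    root = Path(root)
    os.chmod(root, DIR_MODE)
    dirs_changed, files_changed = 1, 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, DIR_MODE)
                dirs_changed += 1
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, FILE_MODE)
                files_changed += 1
    return dirs_changed, files_changed


if __name__ == "__main__":
    # Hypothetical location; substitute the folder you extracted the files into.
    print(secure_tree("/var/www/wordpress/tiny-folder-name"))
```

Run it as a user who owns the files (or with sudo), then check the result with ls -l before pointing your browser at the site.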
At this point you’re ready to go. I went to http://mydomain/tiny-folder-name and saw the login screen. I logged in (username: admin, password: password) and changed the password and created a new user.
Add Your Users
This is one of the nice things about Tiny Tiny. You can have more than one person using the server, each with their own account and their own news feeds. You can import your old Google Reader subscriptions.xml file under the OPML setting and there are a lot of other customizations you can apply.
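Before importing, it can help to check what is actually in the subscriptions.xml file. The sketch below uses only Python's standard library to list the feeds in an OPML document. The function name list_feeds and the sample feeds are invented for illustration; the only assumption about the file is the usual OPML layout, where each feed is an outline element with an xmlUrl attribute and folders are outline elements that wrap other outlines.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the usual OPML layout: one folder holding a feed,
# plus one feed at the top level.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Sample subscriptions</title></head>
  <body>
    <outline title="Law" text="Law">
      <outline type="rss" title="Example Law Blog" text="Example Law Blog"
               xmlUrl="http://example.com/law/feed" htmlUrl="http://example.com/law"/>
    </outline>
    <outline type="rss" text="Example Tech News"
             xmlUrl="http://example.com/tech/rss" htmlUrl="http://example.com/tech"/>
  </body>
</opml>"""


def list_feeds(opml_text):
    """Return (title, feed_url) pairs for every outline that has an xmlUrl.

    Folder outlines have no xmlUrl, so they are skipped, but the feeds
    nested inside them are still found.
    """
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            title = outline.get("title") or outline.get("text") or url
            feeds.append((title, url))
    return feeds


if __name__ == "__main__":
    for title, url in list_feeds(SAMPLE_OPML):
        print(f"{title}: {url}")
```

To run it against a real export, read the file's text and pass it to list_feeds in place of SAMPLE_OPML. A quick count of the results is an easy way to confirm that nothing was lost after the import.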
There is a lot of functionality under the hood. You can customize the CSS to make it look the way you like across your entire installation, set up e-mail digests of information, control how many posts are stored and more. Features I like:
Easy to read all unread messages and mark all read;
Sharing tools built into each message, so I can activate plugins and send to Google+ or send as an e-mail to someone else;
Tiny Tiny will apply Google Reader tags when you import but you can also apply your own to categorize feeds;
Clicking on the title of a document will open the original post in your Web browser;
Like Omea, you can add annotations to a post, so that you can add additional context to it;
There is a public sharing function, so that a Tiny Tiny installation within an organization could be used by a research team to share posts with lawyers and others who otherwise wouldn’t be monitoring the RSS feeds.
It’s an incredibly light application. Tiny Tiny RSS is entirely Web-based, so it will work in any Web browser on tablets or computers. I have not tried it on a phone – it should work but I’m not sure the experience would be very enjoyable.
A single message displayed in a preview window below the unread messages in Tiny Tiny RSS
Google’s cancellation of Reader and the general state of confusion in the RSS reader world make a tool like Tiny Tiny more compelling. It allows you to ensure availability of this powerful research tool and it can easily be made available to multiple lawyers or researchers in your law firm. It’s open source as well, so your IT staff can customize it specifically for your firm as well as understand exactly what’s going on under the hood. Tiny Tiny is not like the social, image-heavy RSS readers that are proliferating, particularly in the mobile app market. Instead, it can be a heavy-duty replacement for Google Reader.
Google has announced the sunset of Google Reader. It has been my primary news reader for years and I’ve continued to stick with it, even when it lost some functionality with the shift to Google+. The decision to get rid of it means finding a decent replacement but I’m probably going to have to change my reading habits.
There are loads of very good RSS clients. Unfortunately, many of them are mobile – see Flipboard and Pulse, for example – and shift away from the universal access I enjoyed with Reader. Some services, like Feedly, offer RSS support over the Web and a mobile app. I took a quick look at Feedly but can’t figure out how to access feeds without a Google Reader linkage.
Some of them also lack the ability to import your current RSS subscriptions, which you’ll be able to export from Google Reader as an OPML file. Feedly allows importing OPML through a workaround – which requires Google Reader! The ubiquity of Google Reader meant that a number of the other RSS feed readers relied on it. If you read an RSS news item in one reader, it won’t necessarily be marked as read in another one, so these other readers would synchronize your activity with Google Reader.
Alternatives to Google Reader: Desktop, E-mail, Browser
Mac users can try NetNewsWire, which also works with iOS devices. Ubuntu users might look at Liferea for a straightforward desktop RSS reader. Your e-mail software can also sometimes act as an RSS reader. Microsoft Outlook can track your feeds and you can add feeds to Mozilla’s Thunderbird, although it’s reaching the end of its life as well.
Your Web browser may also have a good RSS extension. It won’t provide you universal access but it can enable you to re-use your current technology. For those of us in organizations where we may not be able to install new software, this may be a good option. Mozilla Firefox users should take a look at Sage or Brief.
Another option may be to use a portable RSS reader. Portable Apps has a packaged version of the QuiteRSS reader. It will import your RSS subscriptions and you can take it with you and run it on whatever computer you are currently using.
I’ve already downloaded my OPML file from Google Takeout and am moving on. I’m probably going to go with Omea Reader. It will change my work habits – I’ll probably read my RSS less often away from work – but it has a lot of powerful features that should help me to manage the information that I come across better.
Update: No, I’m not. Something’s not quite right with Omea and it’s not updating properly [I decided to go cold turkey this morning, so totally flipped off Google Reader]. I’m liking using Brief + Firefox at the moment, and am wondering if I can use Firefox’s sync feature to keep my unread information updated across machines.
At one point in time, there were a number of sites trying to provide search of information as it came whistling by on social media streams. Most of them have gotten out of the business or, if they have a social search, it’s not necessarily that current. Kurrently caught my eye because it seems to provide a fast-rolling response to any search you put into it. It retrieves messages posted to Facebook, Twitter, and Google+.
To be honest, I was a bit skeptical so I put in a hashtag that I was following on Twitter and watched the search results on Twitter and the stream on Kurrently. At least in this case, Kurrently was displaying results before Twitter was, although the difference was only a few minutes so it may have just been a matter of refreshing my browser.
Kurrently can filter out messages from any one of the three buckets it is monitoring, so you can limit the stream to just Facebook or Google+. You can also speed up or slow down the stream, in case it’s roaring past or just dripping like water torture. You can also bookmark your search term – just as you can by bookmarking a search on Twitter – so it would be relatively easy to create a folder of saved topics. However, since the whole goal is to see what’s happening at the moment, I’m not sure bookmarking on Kurrently makes a whole lot of sense.
I’m adding Kurrently to my toolkit when I want to watch a broad topic that is likely to be discussed in more than one of the main social media locations, or as a quick dive into a discussion or for a sense of sentiment.
If you are looking for what a Web site used to look like, the Wayback Machine from the Internet Archive is an excellent starting point. They have released a beta of a new version of this useful tool. The most notable feature is that they get rid of the large calendar you view after your search. Instead, it defaults to a view of the site you’re searching for, and you can then select from a small timeline at the top of the screen for a different date.
If you want to save and mark up Web pages with a tool that’s a bit lighter than Evernote or Microsoft OneNote, Annotary may be an option. It is a browser extension for Firefox or Google Chrome. When you hit a Web page that has content you want to save, you click the Annotary button in your Web browser toolbar and save the page. You can add it to a collection (which works like a folder) and you can also add a bookmark.
You can share a single page. You can also create a group of people, sending them an invitation to participate, and share a collection with them. This could work well for librarians supporting a practice group or faculty on a given project. However, the sharing feature seems to be excessively social. There is a way to see who else has annotated or highlighted the same page, and there doesn’t appear to be any way to turn off this sharing.
In fact, I couldn’t find it on the main Web site but you can browse the Bulk portion by going directly to the top URL: https://bulk.resource.org/. Some of the other content hasn’t been updated in the last year, although that may reflect the actual government publication schedule. There are U.S. state public safety regulations (administrative codes dealing with things like elevator installation, etc.) and patent and trademark databases. Once you’ve found a directory that has content you’re interested in, consider using a site delimiter in Google to search it. For example, if I want to find Society for the Prevention of Cruelty to Animals (SPCA) groups, I might search Google using:
Web search is a cornerstone of online legal research. Google, Yahoo!, and Bing remain starting points for lawyers and others to perform research outside of the fee-based providers. Both Google and Bing have put additional social elements into their search engine results pages (SERPs). These are intended to float up results that appear in social networks and that may be more timely. They are also part of an increasing shift towards providing personalized search results, with Google mining information based on your Google account, and Bing latching on to your Facebook account, if you link it to your search account.
Unfortunately, searchers seem to be resistant to these changes. There is good reason. First, relevant social results require linkages to your social accounts. People participate in a social network for a variety of reasons, but it does not necessarily mean that those messages are more relevant. In particular, if your social networks are not related to your research – personal v. professional, for example – they may bear no relation to the topic you are interested in.
Another challenge is that personalized results re-order the results to attempt to fit your previous search activities. If you search across a variety of areas, or mix personal and professional search within the same account, you may find the custom results page is submerging relevant results. You can turn personalized search off in Google – or keep your account unlinked from Facebook in Bing – but otherwise you can find your results skewed. Personalized search may bring some benefits, like the recent addition of search into your Google Mail and Drive contents, but it changes your research.
Bing has seen its search market share drop after introducing social search into its results. Google has seen a negative reaction to its implementation as well. Legal researchers need to watch for how personalization can impact their search. If you need to search social results – blogs, Twitter, or that sort of information – you are probably better off using a social search engine like Topsy.
Public records are a wealth of information about clients, opponents, and other parties related to cases. They can unearth information about properties and corporations that can be helpful in building a case and creating a litigation strategy. United States researchers have some of the most extensive access to this sort of information. The Legal Skills Prof blog highlighted a story from Law Technology News on TLO, an online service that has reports starting at US$1. If you are doing public records research and the free services aren’t getting you anywhere, this looks like an excellent option.