Using WebClient with Basic Authentication and Forms Authentication

I had the chance to investigate how we could automate downloads from a couple of websites. The current process is excruciatingly manual, and ripe for errors (as all manual processes are).

I first went to check out the websites to see what we were dealing with here. Is there an API that could be used to pull the files down? Is there some sort of service that they could provide to push the files to us? No. And no, of course.

So no API. No clean way to do it. I’ll just have to log in programmatically and download what I need. Of course, the two sites I was accessing had completely different implementations.

The first one was pretty easy. It just uses basic authentication and then allows you to proceed. Why a public-facing web application uses basic authentication in 2015 I don’t know, but I guess that’s another conversation.

Here’s how I implemented it. I also needed to actually download the file by sending a POST to a particular URL. I needed to save it somewhere specific so that’s included as well.

            Uri uri = new Uri(_authUrl);

            var credentialCache = new CredentialCache();
            credentialCache.Add(
              new Uri(uri.GetLeftPart(UriPartial.Authority)), // request url's host
              "Basic",  // authentication type. hopefully they don't change it.
              new NetworkCredential(_uname, _pword) // credentials 
            );

            using (WebClient client = new WebClient())
            {
                // Send the explicit credentials from the cache (not the Windows logon credentials).
                client.Credentials = credentialCache;

                System.Collections.Specialized.NameValueCollection formParams = new System.Collections.Specialized.NameValueCollection();

                // This is the stuff that the form on the page expects to see. Pulled from the HTML source and javascript function.
                formParams.Add("param1", "value1");
                formParams.Add("param2", "value2");
                formParams.Add("param3", "value3");
                formParams.Add("filename", _downloadFileName);

                byte[] responsebytes = client.UploadValues(_urlForDownload, "POST", formParams);

                // Make sure the target directory exists. CreateDirectory does nothing if it is already there.
                Directory.CreateDirectory(_fileDownloadLocation);

                File.WriteAllBytes(Path.Combine(_fileDownloadLocation, _downloadFileName), responsebytes);
            }

The other website used Forms Authentication in its implementation. While this was a welcome difference (since, again, it’s 2015), it did make things a little more difficult.

I couldn’t just use .NET’s WebClient as-is this time because it doesn’t keep cookies between requests. And most applications on the internet use sessions, cookies, and other such hackery to keep track of you and make sure that you’re really logged in and are who you say you are.

I found an implementation of what seems to be called a “cookie-aware WebClient.” I don’t recall which site I got it from, but many implement it in a very similar way. Here is the code for a class called WebClientEx. It simply extends WebClient:

    public class WebClientEx : WebClient
    {
        public WebClientEx(CookieContainer container)
        {
            this.container = container;
        }

        private readonly CookieContainer container = new CookieContainer();

        protected override WebRequest GetWebRequest(Uri address)
        {
            WebRequest r = base.GetWebRequest(address);
            var request = r as HttpWebRequest;
            if (request != null)
            {
                request.CookieContainer = container;
            }
            return r;
        }

        protected override WebResponse GetWebResponse(WebRequest request, IAsyncResult result)
        {
            WebResponse response = base.GetWebResponse(request, result);
            ReadCookies(response);
            return response;
        }

        protected override WebResponse GetWebResponse(WebRequest request)
        {
            WebResponse response = base.GetWebResponse(request);
            ReadCookies(response);
            return response;
        }

        private void ReadCookies(WebResponse r)
        {
            var response = r as HttpWebResponse;
            if (response != null)
            {
                CookieCollection cookies = response.Cookies;
                container.Add(cookies);
            }
        }
    }

And its usage for me is as follows:

            CookieContainer cookieJar = new CookieContainer();

            HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(_urlForLoginPage);
            req.CookieContainer = cookieJar;
            req.Method = "GET";
            Uri uri;

            // First send a request to the login page so that we can get the URL that we will be redirected to, which contains the proper
            // querystring info we'll need.
            using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
            {
                uri = response.ResponseUri;
            }

            // The C# WebClient will not persist cookies by default. Found this WebClientEx class that does what we need for this.
            using (WebClientEx ex = new WebClientEx(cookieJar))
            {
                var postData = string.Format("USER={0}&PASSWORD={1}&target={2}", _uname, _pword, _urlForDownload);
                var resp = ex.UploadString(uri, postData);

                // Note that useUnsafeHeaderParsing is set to true in app.config. The response from this URL is not well-formed, so was throwing
                // an exception when parsed by the "strict" default method.
                ex.DownloadFile(_wirelessToWireline, string.Format(@"{0}\FILE1-{1}.TXT", _fileDownloadLocation, DateTime.Now.ToString("yyyyMMdd")));
                ex.DownloadFile(_wirelineToWireless, string.Format(@"{0}\FILE2-{1}.TXT", _fileDownloadLocation, DateTime.Now.ToString("yyyyMMdd")));
            }

You’ll often hear of people struggling with the redirect that is sent back. With Forms Authentication, the server typically answers an unauthenticated request with a redirect to the login page (rather than the 401 challenge that Basic authentication uses); it’s basically the server asking for credentials. In my case, I needed to send the request and get the information that was appended to the querystring anyway, so it was handy. I then posted the data to the form that the application would be expecting, and downloaded my files.
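One thing worth guarding against when building the POST body by hand is a username, password, or target URL that contains characters like & or =. Here is a minimal sketch of a helper that URL-encodes each field before joining them. The FormEncoder class and its Encode method are names I made up for illustration; they are not part of WebClient or of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the original code): builds an
// application/x-www-form-urlencoded style body from key/value pairs.
static class FormEncoder
{
    public static string Encode(IEnumerable<KeyValuePair<string, string>> fields)
    {
        // Escape both the key and the value so that characters such as
        // '&', '=' and '?' cannot break the field boundaries.
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}
```

The result could then be passed to UploadString in place of the string.Format call, assuming the login form accepts percent-encoded values (most do, but I have only checked the one site).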

Also note that the server I was downloading the information from sent back the response in a way that the .NET framework didn’t like, by default. So I had to set useUnsafeHeaderParsing to true. This was an acceptable risk for me. Make sure that you know what it means.

This took longer than I care to admit to implement, but once I found and understood the “cookie-aware” concept, it worked out pretty well.

Directory.GetFiles VS Directory.EnumerateFiles

Where I work, we have fairly large archives of files due to the large volume of messages received from various clients. Most of these messages are received through a VLTrader or ASP(M)X front-end. Eventually they are archived onto the file system according to some pre-determined process. The teams supporting these archives had grown concerned about the ever-increasing amount of storage required for these files. There are thousands of directories (nested many levels deep) and hundreds of thousands of files, potentially millions.

I was asked to help come up with a solution for this problem. The app needed to be configurable when run to specify the root directory and the number of days back to check the date on the file. I needed to allow them to specify that all files older than 90 days should be deleted, for example.

My initial reaction was to use the excellent (and very convenient) System.IO.Directory.GetFiles and System.IO.Directory.GetDirectories methods to simply get an array of the files and directories I would need to enumerate in order to accomplish the task. So I wrote a quick app, utilizing these methods, and saw the IOPS go crazy for a while, then do nothing, then go crazy again. All the while, not much was being accomplished. The issue, as anyone who has tried to “browse” the file system using Windows Explorer may tell you, is that getting the properties of the entire tree, including the number/size of directories and number/size of files, is quite an expensive process.

After doing a bit more research, I came upon the Directory.EnumerateFiles method, which (you guessed it) returns an enumerable collection of file names in a specified path, as opposed to Directory.GetFiles, which returns an array of file names in a specified path. Because the results are streamed, you can start working on the first file before the whole tree has been walked. The difference when checking a path with thousands of directories and hundreds of thousands of files is huge. In fact, you don’t even have to have that many files/directories to see a dramatic difference. This method is only available in .NET 4.0 and above. I have seen others suggest ways of doing something similar with the Win32 API, but it was much easier for me to make sure I had .NET 4.0 available than it was to try and implement something using the Win32 API.

Usage is simply:

foreach (string file in Directory.EnumerateFiles(rootDirectory, "*", SearchOption.AllDirectories))
    ShouldDeleteFile(file);

When using these methods, be sure that proper permissions are available on the entire tree; otherwise you may get an exception. See this post at Stack Overflow for more information. Speaking of permissions: part of my requirement was that I was supposed to delete all files more than 90 days old and all directories which were empty. To avoid any potential conflicts with permissions and/or file properties, the application will run as an administrator and

File.SetAttributes(filePath, FileAttributes.Normal);

is being set each time through. I’m not sure what performance penalty this may incur. I’ll have to research and see what the hit would be.
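To make the idea concrete, here is a rough sketch of how the pieces could fit together. It is not the actual application: the ArchiveCleaner class, the two-argument ShouldDeleteFile, and the choice of last write time as "the date on the file" are all assumptions on my part.

```csharp
using System;
using System.IO;

static class ArchiveCleaner
{
    // Hypothetical check: true when the file was last written before the cutoff.
    static bool ShouldDeleteFile(string path, DateTime cutoffUtc)
    {
        return File.GetLastWriteTimeUtc(path) < cutoffUtc;
    }

    // Deletes files older than maxAgeDays under rootDirectory
    // and returns how many files were removed.
    public static int DeleteOldFiles(string rootDirectory, int maxAgeDays)
    {
        DateTime cutoffUtc = DateTime.UtcNow.AddDays(-maxAgeDays);
        int deleted = 0;

        // EnumerateFiles streams results, so deletion starts right away
        // instead of waiting for the whole tree to be listed.
        foreach (string file in Directory.EnumerateFiles(rootDirectory, "*", SearchOption.AllDirectories))
        {
            if (!ShouldDeleteFile(file, cutoffUtc))
                continue;

            // Clear read-only and similar attributes so Delete does not fail on them.
            File.SetAttributes(file, FileAttributes.Normal);
            File.Delete(file);
            deleted++;
        }

        return deleted;
    }
}
```

A call such as ArchiveCleaner.DeleteOldFiles(rootDirectory, 90) would cover the 90-day example. Removing empty directories and handling access exceptions are left out of this sketch.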

Parallel.ForEach

I’ve spent some time off and on over the last year or so writing various versions of web crawlers to get different information off of the web. Some of it for a potential business idea, some of it just to learn a few things. One thing I had a hard time trying to figure out was how to deal with threading. I have a list of URLs that I wanted to crawl, but I had specific things that I wanted to try and do with each one, and there were various counters I was incrementing. Plus, threading and I don’t jibe that well, I’ve found. Maybe I’m just not smart enough for it, who knows.

As I was doing my research/learning/reading about C# in general, I ran across the excellent Parallel Processing blog from MSDN. I was fascinated by the Microsoft Biology Foundation and how they were using the parallelism support in .NET 4. The blog is a good read in general. Those guys are a bit too smart for me to keep up with, but it’s fascinating nonetheless.

I’ll let the smart guys at that blog explain it better than I can, but Parallel.ForEach lets the runtime spread the iterations of a loop across additional threads when additional cores are available. It’s important to note that you may not gain much from this technique if some other outside resource is what is slowing down your processing. But in my case, each iteration goes out to a website and pulls information from different pages, so the iterations are independent and several requests can be in flight at once. Parallel.ForEach allowed me to do this much faster than a regular foreach loop. Good stuff.
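For the counters I mentioned, the general shape looks something like the sketch below. The URLs are placeholders and the Thread.Sleep call stands in for the real download-and-parse work, so treat it as an illustration of the pattern rather than my crawler.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // Placeholder URLs; a real crawler would load its own list.
        var urls = new List<string>
        {
            "http://example.com/a",
            "http://example.com/b",
            "http://example.com/c"
        };

        int processed = 0;

        Parallel.ForEach(urls, url =>
        {
            // Stand-in for fetching and parsing the page at this URL.
            Thread.Sleep(100);

            // A plain processed++ is not safe across threads;
            // Interlocked.Increment updates the shared counter atomically.
            Interlocked.Increment(ref processed);
        });

        Console.WriteLine("Processed {0} URLs", processed);
    }
}
```

The main thing I had to get used to is that anything shared between iterations (counters, lists, files) needs this kind of protection, since the loop body may run on several threads at once.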

SQL Server Date Functions

Here are some handy date functions that I find myself looking up occasionally (especially the “last day of”-type things):

-- Today
SELECT GETDATE() 'Today'
-- Yesterday
SELECT DATEADD(d,-1,GETDATE()) 'Yesterday'
-- First Day of Current Week
SELECT DATEADD(wk,DATEDIFF(wk,0,GETDATE()),0) 'First Day of Current Week'
-- Last Day of Current Week
SELECT DATEADD(wk,DATEDIFF(wk,0,GETDATE()),6) 'Last Day of Current Week'
-- First Day of Last Week
SELECT DATEADD(wk,DATEDIFF(wk,7,GETDATE()),0) 'First Day of Last Week'
-- Last Day of Last Week
SELECT DATEADD(wk,DATEDIFF(wk,7,GETDATE()),6) 'Last Day of Last Week'
-- First Day of Current Month
SELECT DATEADD(mm,DATEDIFF(mm,0,GETDATE()),0) 'First Day of Current Month'
-- Last Day of Current Month
SELECT DATEADD(ms,-3,DATEADD(mm,0,DATEADD(mm,DATEDIFF(mm,0,GETDATE())+1,0))) 'Last Day of Current Month'
-- First Day of Last Month
SELECT DATEADD(mm,-1,DATEADD(mm,DATEDIFF(mm,0,GETDATE()),0)) 'First Day of Last Month'
-- Last Day of Last Month
SELECT DATEADD(ms,-3,DATEADD(mm,0,DATEADD(mm,DATEDIFF(mm,0,GETDATE()),0))) 'Last Day of Last Month'
-- First Day of Current Year
SELECT DATEADD(yy,DATEDIFF(yy,0,GETDATE()),0) 'First Day of Current Year'
-- Last Day of Current Year
SELECT DATEADD(ms,-3,DATEADD(yy,0,DATEADD(yy,DATEDIFF(yy,0,GETDATE())+1,0))) 'Last Day of Current Year'
-- First Day of Last Year
SELECT DATEADD(yy,-1,DATEADD(yy,DATEDIFF(yy,0,GETDATE()),0)) 'First Day of Last Year'
-- Last Day of Last Year
SELECT DATEADD(ms,-3,DATEADD(yy,0,DATEADD(yy,DATEDIFF(yy,0,GETDATE()),0))) 'Last Day of Last Year'

I originally found them on the excellent SQL Authority blog.

HCPCS 2011 ICD9 Codes

There’s been a bit of activity on the OpenEMR lists lately about the ability to import the ICD9 codes into the application. Apparently there are some Perl scripts which go out to a particular website, extract the data, and pull it down for the application to use. I’ve been wanting an excuse to try the Parallel.ForEach functionality in .NET 4.0 and see how it works with threading. This provided a perfect opportunity to write a quick program which would go out, parse the site and data, and pull it down. In addition to the Parallel functions, I’ve also used the excellent HtmlAgilityPack to parse the data.

I’m not exactly sure about where the data ends up yet (I’m not as familiar with the OpenEMR data model as I should be), so all I have for now is a tab-delimited text file which simply contains the code “type” (all HCPCS in this case), ICD9 code, and its description. I’ll have to poke through the OpenEMR code and database in the coming days and see what is done with the data. Perhaps then I can create a SQL file that someone can then load in phpMyAdmin inside of OpenEMR.

The file is located here: hcpcs 2011 ICD9

Configurable EndPoint for WCF Connecting to Authorize.NET’s ARB

Configuring a WCF service in a class library has been a struggle for me in the past. There was always something that I knew should be done differently, as it just didn’t “feel” right to have to recompile the class library when we move from a test environment to a production environment.

This specific example uses WCF to connect to Authorize.NET’s ARB service for creating subscriptions.

Here is what I came up with:

// Be sure to configure this in the database for the various environments, as needed
EndpointAddress ea = new EndpointAddress(YourDataAccess.GetUrl);

// HTTPS
BasicHttpBinding serviceBinding = new BasicHttpBinding(BasicHttpSecurityMode.Transport);
serviceBinding.ReceiveTimeout = new TimeSpan(0,0,0,20);
ARB.ServiceSoapClient service = new ARB.ServiceSoapClient(serviceBinding, ea);
ARB.ARBCreateSubscriptionResponseType response;

// Set the credentials
authentication = new ARB.MerchantAuthenticationType();
authentication.name = this.AuthNetName();
authentication.transactionKey = this.AuthNetTxn();
response = service.ARBCreateSubscription(authentication, sub);

Force reboot of a remote server

This is for my own reference more than anything. I often (too often, it seems) find myself needing to remotely restart a server. Most of the time it is because the server did not fully go down when I issued the reboot command from the console.

shutdown /s /m \\remotemachinename    (shutdown)
shutdown /r /m \\remotemachinename    (reboot)

Dell XPS M1330 Wireless Problems

I recently purchased a Dell XPS M1330 laptop. Great machine. Really no issues. Other than one huge one. The wireless connection will suddenly drop at random times. Sometimes it will come back. Sometimes it will come back after manually resetting the connection. Sometimes it will come back only after a full reboot. Sometimes it will come back after holding a seance. Sometimes it just won’t come back.

Searching the internet has helped me understand that I’m not the only one with this issue. It seems to be a fairly common (or all-too common, at least) issue that users of these laptops experience. It may or may not be related to running Vista. Who knows. I do have one other Dell laptop running Vista Ultimate that has no issues whatsoever. And a ThinkPad. And an iMac. So the router is just fine. In fact, I even bought a new router not too long ago, thinking that this would help with the issue.

So I spent over an hour on the phone/connected with Dell’s tech support in India. They were going through the motions of removing and then reinstalling the drivers, which I told them I had already done. They changed a few other configuration things, which didn’t help at all. The end result of the conversation was that I would need to *pay* someone to help me further, since it is a “software issue” and “not related to the hardware.”

Me: “So you guys send me a laptop that has had this problem since I pulled it out of the box, and now you’re telling me that I need to pay someone to come have a look at it.”
Dell: “Yes. We only warranty the hardware.”
Me: “I didn’t put any of this other software on here. It came to me this way from the factory.”
Dell: “It is not a hardware issue, sir. I cannot help you. But you can call and schedule an appointment with . . .”

If there’s one thing that I’ve learned from dealing with Dell’s “customer service” folks, it’s that they only read from a script, and cannot help you beyond that. It’s almost like they use the cultural differences between Americans and Indians to their advantage, so they can slyly imply that I am rude, and they are trying to help me. So I just said thank you and hung up.

Hopefully Dell can get their act together with not only this particular issue, but their glaring lack of customer service. I love this machine. But I do not have the patience to deal with this incompetence. Do any of my 1.3 readers have any suggestions?

Update: So I did a bit more research on this, and the problem appears to be associated with the Dell wireless card. Specifically, the Dell Wireless 1505 Draft 802.11n WLAN Mini-Card. So I went to eBay and ordered Intel’s 4965 card. Hopefully this will help to solve my issue. And hopefully I am smart enough to pop out the old one and put in the new one. Dell’s support is still horrible. All they wanted was to have me pay for someone to come out and probably uninstall/install the same drivers again. They just need to come out and say that this card is a piece of junk.

SQL Server 2005 Exporting Results as tab-delimited

I’ve had this situation come up a few times. The standard “Save Results As” dialog basically gives you the option of saving the results as a CSV file. My results happened to have commas in the data and I also needed the column headers. Seems simple enough, but I couldn’t figure out how to do it. Then I stumbled upon this. What I was missing from the equation was telling Management Studio to export the results as text. So I did this:

Results to Text

Once I got all of that configured, I exported the file as a CSV, opened up Excel, and “imported” the data as I would have any other CSV. Because I had put the setting as tab-delimited, I was good. Seems obvious, but I had to write it down for the next time I need it.

Saving a bar code image to JPG

I’ve used the excellent iTextSharp library to generate PDFs for different projects. It works very well and has been an excellent tool. One of my recent projects had me needing to generate bar codes for use in a rebate application. The bar code would be the unique rebate ID, used by the mail room scanner to streamline and accelerate the data entry and processing. There are other libraries out there, but since I was already familiar with iTextSharp and knew that it included bar code libraries, I decided to try it out. It was so easy it was nearly ridiculous.

I decided to implement it as an HttpHandler, so that it could be accessible by different applications (including my own). In addition to the bar code ID, the calling application would also have to pass some identifying information, which would give minimal security to the page.

It went something like this:

Page.aspx?id=123456&z=12345

Where the two parameters would form a unique key that would allow the user to look up information and get the desired bar code. Inside Page.aspx, I have it calling something like this:

Here is the code for BarCode.ashx:

        public void ProcessRequest(HttpContext context)
        {
            string _barCodeId;

            if (context.Request.QueryString["id"] != null)
            {
                _barCodeId = context.Request.QueryString["id"].ToString();
            }
            else
            {
                throw new ArgumentException("No Bar Code ID specified");
            }

            context.Response.ContentType = "image/jpeg";

            System.IO.MemoryStream strm = new System.IO.MemoryStream();
            iTextSharp.text.Document doc = new iTextSharp.text.Document(iTextSharp.text.PageSize.A4, 50, 50, 50, 50);
            iTextSharp.text.pdf.PdfWriter writer = iTextSharp.text.pdf.PdfWriter.GetInstance(doc, strm);
            doc.Open();

            iTextSharp.text.pdf.PdfContentByte cb = writer.DirectContent;
            iTextSharp.text.pdf.Barcode128 code128 = new iTextSharp.text.pdf.Barcode128();
            code128.Code = _barCodeId;
            code128.StartStopText = true;
            code128.GenerateChecksum = false;
            code128.Extended = true;

            code128.CreateDrawingImage(System.Drawing.Color.Black, System.Drawing.Color.White).Save(context.Response.OutputStream, System.Drawing.Imaging.ImageFormat.Jpeg);
        }