Spring 2010




On 1/21/2010 4:12 PM, 

Hello Ted!
I've been working on the first lab, and have gotten hung up at the very very end. :)  On question number 5, I have collapsed all the wage data into average wages, by occupation.  Then I merged this variable back into my original dataset.  For the life of me, though, I can't figure out how to get STATA to list the information categorized by occupation so that I can see the number in each category.  I feel sure this is very easy, but I'm drawing a blank.  When I try the command you suggested on the lab sheet, "list if n>50," STATA keeps telling me it doesn't recognize "n."  Does that make sense?
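A (Ted): n has to be created before the collapse. Generate n equal to 1 for every worker, then sum it in the collapse command; after the collapse, n is the count of workers in each occupation.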

gen n=1

collapse (mean) fem wage* (sum) n, by(occ)
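
To see the whole sequence run end to end, here is a minimal sketch that uses Stata's built-in auto dataset as a stand-in for the lab file (an assumption, since the MORG extract isn't reproduced here). rep78 plays the role of occ, price and mpg play the role of the wage variables, and the cutoff of 10 stands in for 50 because auto has only 74 observations.

```stata
* Stand-in data: auto.dta ships with Stata (not the lab's MORG file)
sysuse auto, clear

* Every observation counts as 1, so the sum within a group is the group size
gen n = 1

* One row per group: mean price and mpg, plus the number of cars in the group
collapse (mean) price mpg (sum) n, by(rep78)

* n now exists in the collapsed data, so it can be used in an if condition
list if n > 10
```

If "list if n>50" is run on data where n was never created, Stata has no variable called n to evaluate, which is the likely reason for the error in the question.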






lab 1 advanced (voluntary)



Q (ND): When I run the do-file that we created in lab, it runs until the command gen occ_cat=occ, where Stata tells me "occ ambiguous abbreviation". What's going on?


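A (Ted): It means Stata can't tell which variable you mean. It is reading occ as an abbreviation, and more than one variable name in the file starts with occ, so you need to use the full variable name in the gen command. To see which variables start with occ, try this: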

des occ*


(The reason for the problem is that I must have changed the MORG file slightly between the version I wrote the lab with and the version that you have.)
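
The same error can be reproduced on Stata's built-in auto dataset, used here only as a stand-in for the MORG file (an assumption). The sketch below shows the ambiguous abbreviation, how a wildcard describe reveals the candidates, and the fix of spelling out the full name.

```stata
sysuse auto, clear

* Two variables start with "t" (trunk and turn), so "t" alone is ambiguous
describe t*

* capture noisily shows the error message but lets the do-file keep running
capture noisily gen t_copy = t

* Spelling out the full variable name removes the ambiguity
gen turn_copy = turn
```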



survey questions:

Q: How do I change a variable's format? For example, I created a variable whose format is string, but the "label" command cannot be applied to a string variable.

What should I do to change its format so it can have 1 for 'male', 0 for 'female', etc.?


A (Ted): Use the destring command. It converts a string variable whose values are numbers stored as text into a numeric variable. Example:

destring price, generate(price2)
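
For the male/female case in the question, the string holds words rather than digits, so destring alone would not produce a 0/1 variable. A minimal sketch of two common alternatives follows; the tiny dataset and the names male, malelbl, and sex_num are made up for illustration.

```stata
* Hypothetical three-row dataset with two string variables
clear
input str6 sex str4 price
"male"   "10"
"female" "25"
"male"   "40"
end

* Digits stored as text: destring converts them to a numeric variable
destring price, generate(price2)

* Words stored as text: build a 0/1 indicator and attach value labels to it
gen male = (sex == "male") if !missing(sex)
label define malelbl 0 "female" 1 "male"
label values male malelbl

* Alternative: encode makes a labeled numeric variable, with codes starting at 1
encode sex, generate(sex_num)

list
```

encode numbers the categories 1, 2, ... in alphabetical order, so the hand-built indicator is the one to use when the coding must be exactly 1 for male and 0 for female.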

Q: You mentioned that the asterisk is a wildcard. I've also used it for multiplication. Will you talk about the rules for when and how to use it?

A (Ted): In a variable list, it is a wildcard.

Example:  des occ*

In a gen command (or any other expression), it is multiplication.

Example:  gen a=b*c

di 3*4




Q: Is it always necessary to redefine variables when collapsing data, or is that a step taken to ensure clarity of results? I.e., could we write collapse (mean) sex wage (count) occ_n=sex, by(occ)?

A (Ted): It is not necessary to redefine variables. It is useful if you are creating several variables from the same thing, because each statistic then needs its own name:


collapse (mean) m_wage=wage  (sd) sd_wage=wage (count) n_wage=wage, by(occ)
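
As a rough illustration of both cases, the sketch below again uses Stata's built-in auto dataset as a stand-in (an assumption), with foreign in place of occ and price in place of wage.

```stata
sysuse auto, clear

* price and mpg keep their own names; only the count needs a new name (n)
preserve
collapse (mean) price mpg (count) n=price, by(foreign)
list
restore

* Three statistics from the same variable need three distinct names
collapse (mean) m_price=price (sd) sd_price=price (count) n_price=price, by(foreign)
list
```

preserve and restore are used here only so that both collapse commands can start from the same full dataset.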


This may be helpful to others. I'm using Stata through Citrix on my MacBook. When output runs longer than one page, Stata displays "-more-" at the bottom of the screen. This becomes a problem when I have a large dataset loaded and enter a command that displays pages of data by mistake, because I have to keep hitting the spacebar to advance the screen. I now enter this command at the start of my session: "set more off". (Ted has this in his do-file example.) It makes the display scroll without user intervention, and it's easy to scroll back in the history to copy any data you need.


A (Ted): Another thing you can do is hit the Break button, which is the X icon right below the "Window" menu.